Botany 2021

Setup

Install

This workshop was made using R 3.6.2, we suggest you update your R. Prior to starting this workshop, you should have had the following packages installed.

## Make a list of packages
list_of_packages <- c("dplyr", 
                      "tidyr",
                      "plyr", 
                      "spocc", 
                      "ridigbio",
                      "tibble", 
                      "tidyverse",
                      "rbison",
                      "CoordinateCleaner",
                      "lubridate",
                      "ggplot2",
                      "gtools",
                      "raster", 
                      "sp", 
                      "spatstat", 
                      "spThin", 
                      "fields", 
                      "ggspatial", 
                      "rgdal", 
                      "rangeBuilder", 
                      "sf", 
                      "dismo", 
                      "devtools", 
                      "ENMeval", 
                      "caret", 
                      "usdm", 
                      "stringr", 
                      "factoextra", 
                      "FactoMineR", 
                      "multcompView", 
                      "ggsci",
                      "gridExtra", 
                      "ecospat", 
                      "rJava", 
                      "viridis", 
                      "ENMTools", 
                      "ape", 
                      "RStoolbox", 
                      "hypervolume", 
                      "phytools",
                      "picante")

## If you do not have these packages already installed, install them by executing the code below:
new.packages <- list_of_packages[!(list_of_packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)

## Checking version of packages
### Compare to version demo was made with
versiondf <- read.csv("data/setup/setup.csv", stringsAsFactors = FALSE)
current_version <- as.character(do.call(c, lapply(list_of_packages, packageVersion)))
versiondf$current_version <- current_version
updatelist <- versiondf[which(versiondf$version != versiondf$current_version), ]
### Update packages with old versions
lapply(as.character(updatelist$packages), install.packages, ask = FALSE)

## Make sure all packages load
lapply(list_of_packages, require, character.only = TRUE)
# Install packages not in CRAN
library(devtools)
install_github('johnbaums/rmaxent')
install_github("marlonecobos/kuenm")

What are these packages?

tidyverse is a collection of R packages that are used for “everyday data analysis” including ggplot2, dplyr, tidyr, readr, purr, tibble, stringr, forcats. We use dplyr and ggplot2 throughout many of the R scripts.

ridigbio is an R package that queries the iDigBio API.

spocc queries data from GBIF, USGS’ Biodiversity Information Serving Our Nation (BISON), iNaturalist, Berkeley Ecoinformatics Engine, eBird, iDigBio, VertNet, Ocean Biogeographic Information System (OBIS), and Atlas of Living Australia (ALA).

RCurl and rjson are used to query an application programming interface (API), or an online database.

Geospatial packages include sp, raster, maptools, maps, rgdal, RStoolbox, and mapproj.

ggplot2 is used to plot and visualize data. ggbiplot is an add on to ggplot2.

ENMTools, ENMeval, and dismo are specific to ecological niche modeling.

hypervolume and caret have functions related to statistics.

phytools and picante are related to phylogenetics.

factoextra and FactoMiner are used for statistics related to a principal components analysis.



Troubleshooting

  • For spatstat make sure xcode is installed on OS.
system("xcode-select --install")
  • spThin and fields do not compile from source!

  • If you are still having issues installing ggbiplot, you may have to uninstall the package ‘backports’ and reset R.



This exercise requires the package ENMTools. This package requires Java (and the Oracle Java JDK) and that the maxent.jar file is in the folder that contains the dismo package (this package is a dependency of ENMTools). To find out where to put the jar file, run the command:

system.file("java", package="dismo")

Once you have your java and maxent file set, you are ready to go! If you have more issues, find some help here

To download the jar file to the right directory —

devtools::install_github("johnbaums/rmaxent")
rmaxent::get_maxent(version = "latest", quiet = FALSE)

The previous step may mask rgdal! Make sure to reload it!

For rJava, the newest package requires R >= 3.6.0
dismo only requires rJava >= 0.9-7, download 0.9.7 here

If you are experiencing errors with rJava:

## Remove rJava
remove.packages(rJava)
## Update your R Java configuration 
system("sudo R CMD javareconf")
## Restart R and check to see if the path matches
Sys.getenv("DYLD_FALLBACK_LIBRARY_PATH")
## Reinstall rJava
install.packages("rJava")
library(rJava)

If you are still having issues with rJava - check to see if you are using Java 12 (this should be included in the error message):

system("java -version")

The initial release of java 12 does not work - install an old version instead. To install an old version, navigate to the Oracle website or use homebrew.

system("brew cask install homebrew/cask-versions/java11")

Restart R and redo the previous steps. Required: uninstall the newer versions of java prior to repeating and restart R after.

## Uninstall Java 12, then repeat the following
## Remove rJava
remove.packages(rJava)
## Update your R Java configuration 
system("sudo R CMD javareconf")
## Restart R and check to see if the path matches
Sys.getenv("DYLD_FALLBACK_LIBRARY_PATH")
## Reinstall rJava
install.packages("rJava")
library(rJava)

More debugging: Check out jerrybreak’s comment from 12/30/2021.

If you have additional issues, please let us know!

Download Occurrence Data

Modified and created by ML Gaynor.

Load packages

library(dplyr) 
library(tidyr) 
library(plyr) 
library(spocc) 
library(ridigbio) 
library(tibble) 
library(rbison)

Load functions

This is a function I created with Natalie Patten. It will be part of her R package gatoRs (Geographic And Taxonomic Occurrence R-based Scrubbing).

source("functions/gators.R")

Data download from iDigBio

Search for the species Galax urceolata.

iDigBio_GU <- idig_search_records(rq=list(scientificname="Galax urceolata"))

Search for the family Diapensiaceae.

iDigBio_GU_family <- idig_search_records(rq=list(family="Diapensiaceae"), limit=1000)

What if you want to read in all the points for a family within an extent?

Hint: Use the iDigBio portal to determine the bounding box for your region of interest.

The bounding box delimits the geographic extent.

rq_input <- list("scientificname"=list("type"="exists"),
                 "family"="Diapensiaceae", 
                 geopoint=list(
                   type="geo_bounding_box",
                   top_left=list(lon = -98.16, lat = 48.92),
                   bottom_right=list(lon = -64.02, lat = 23.06)
                   )
                 )

Search using the input you just made

iDigBio_GU_family_USA <- idig_search_records(rq_input, limit=1000)

Save as csv files

write.csv(iDigBio_GU, "data/download/iDigBio_GU_20210614.csv",
          row.names = FALSE)
write.csv(iDigBio_GU_family, "data/download/iDigBio_GU_family_20210614.csv", 
          row.names = FALSE)

Data download using spocc_combined

Make synonym lists

Shortia_galacifolia <- c("Shortia galacifolia", "Sherwoodia galacifolia")
Galax_urceolata <- c("Galax urceolata", "Galax aphylla")
Pyxidanthera_barbulata <- c("Pyxidanthera barbulata","Pyxidanthera barbulata var. barbulata")
Pyxidanthera_brevifolia <- c("Pyxidanthera brevifolia", "Pyxidanthera barbulata var. brevifolia")

Use the spocc_combine function

This function downloads records for all names listed from iDigBio, GBIF, and BISON. It keeps specific columns from each database.

spocc_combine(Shortia_galacifolia, "data/download/raw/Shortia_galacifolia_raw_20210614.csv")
spocc_combine(Galax_urceolata, "data/download/raw/Galax_urceolata_raw_20210614.csv")
spocc_combine(Pyxidanthera_barbulata, "data/download/raw/Pyxidanthera_barbulata_raw_20210614.csv")
spocc_combine(Pyxidanthera_brevifolia, "data/download/raw/Pyxidanthera_brevifolia_raw_20210614.csv")
Here is a table of the columns returned for each species in spocc_combine:
Column Description
name scientific name, http://rs.tdwg.org/dwc/terms/scientificName
basis basis of record, http://rs.tdwg.org/dwc/terms/basisOfRecord
date event data, http://rs.tdwg.org/dwc/terms/eventDate
institutionID institution ID, http://rs.tdwg.org/dwc/terms/institutionID
collectionCode collection code, http://rs.tdwg.org/dwc/terms/collectionCode
collectionID collection ID, http://rs.tdwg.org/dwc/terms/collectionID
country country, http://rs.tdwg.org/dwc/terms/country
county county, http://rs.tdwg.org/dwc/terms/county
state stateprovince, http://rs.tdwg.org/dwc/terms/stateProvince
locality http://rs.tdwg.org/dwc/terms/locality or http://rs.tdwg.org/dwc/terms/verbatimLocality
Latitude http://rs.tdwg.org/dwc/terms/decimalLatitude
Longitude http://rs.tdwg.org/dwc/terms/decimalLongitude
ID idigbio = uuid, gbif = key, bison = occurrenceID
coordinateUncertaintyInMeters http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters
habitat http://rs.tdwg.org/dwc/iri/habitat
prov indicates who provided the data: gbif, bison, or idigbio
spocc_* these columns are returned from the spocc::occ2df function

Occurrence Data Cleaning

Modified and created by ML Gaynor.

Load Packages

library(dplyr)
library(tidyr)
library(raster)
library(sp)
library(spatstat)
library(spThin)
library(fields)
library(lubridate)
library(CoordinateCleaner)
library(ggplot2)
library(ggspatial)

Load functions

This is a function I created with Natalie Patten. It will be part of her R package gatoRs (Geographic And Taxonomic Occurrence R-based Scrubbing).

source("functions/gators.R")

Read in downloaded data frame

rawdf <- read.csv("data/download/raw/Shortia_galacifolia_raw_20210614.csv")

Cleaning

Inspect the data frame

What columns are included?

names(rawdf)
##  [1] "name"                          "basis"                        
##  [3] "date"                          "institutionID"                
##  [5] "collectionCode"                "collectionID"                 
##  [7] "country"                       "county"                       
##  [9] "state"                         "locality"                     
## [11] "Latitude"                      "Longitude"                    
## [13] "ID"                            "coordinateUncertaintyInMeters"
## [15] "informationWithheld"           "habitat"                      
## [17] "prov"                          "spocc.latitude"               
## [19] "spocc.longitude"               "spocc.prov"                   
## [21] "spocc.date"                    "spocc.name"

How many observations do we start with?

nrow(rawdf)
## [1] 791

1. Resolve taxon names

Inspect scientific names included in the raw df.

unique(rawdf$name)
## [1] shortia galacifolia                sherwoodia galacifolia            
## [3] Shortia galacifolia Torr. & A.Gray
## 3 Levels: sherwoodia galacifolia ... Shortia galacifolia Torr. & A.Gray

Create a list of accepted names based on the name column in your data frame

search <-  c("Shortia galacifolia", "Sherwoodia galacifolia")

Filter to only include accepted name:

df <-  filter_fix_names(rawdf, listofsynonyms = search, acceptedname = "Shortia galacifolia")
## [1] "List of synonyms:"
## [1] "Shortia galacifolia"    "Sherwoodia galacifolia"
## [1] "Shortia galacifolia|Sherwoodia galacifolia"
## [1] "Looking at unique names in raw df"
## [1] "shortia galacifolia"                "sherwoodia galacifolia"            
## [3] "Shortia galacifolia Torr. & A.Gray"
## [1] "Looking at unique names in cleaned df"
## [1] "shortia galacifolia"                "sherwoodia galacifolia"            
## [3] "Shortia galacifolia Torr. & A.Gray"

How many observations do we have now?

nrow(df)
## [1] 791

2. Decrease number of columns

Merge the two locality columns

df$Latitude <- dplyr::coalesce(df$Latitude, df$spocc.latitude)
df$Longitude <- dplyr::coalesce(df$Longitude, df$spocc.longitude)

If spocc didn’t download any lat/longs and there are 0 values in the spocc.latitude or spocc.longitude columns, skip this step
The two columns could have different classes, if so, find the fix here

Merge the two date columns

df$date <- dplyr::coalesce(df$date, df$spocc.date)

Subset the columns

df <- df %>%
      dplyr::select(ID = ID, 
                    name = new_name, 
                    basis = basis, 
                    coordinateUncertaintyInMeters = coordinateUncertaintyInMeters, 
                    informationWithheld = informationWithheld, 
                    lat = Latitude, 
                    long = Longitude, 
                    date = date)

3. Clean localities

Filtering out NA’s

df <- df %>%
      filter(!is.na(long)) %>%
      filter(!is.na(lat))

How many observations do we have now?

nrow(df)
## [1] 202

Precision

Round to two decimal places

df$lat <- round(df$lat, digits = 2)
df$long <- round(df$long, digits = 2)

Remove unlikely points

Remove points at 0.00, 0.00

df <- df %>%
      filter(long != 0.00) %>%
      filter(lat != 0.00)

Remove coordinates in cultivated zones, botanical gardens, and outside our desired range

df <- CoordinateCleaner::cc_inst(df, 
              lon = "long", 
              lat = "lat", 
              species = "name")
## Testing biodiversity institutions
## Removed 0 records.

Next, we look for geographic outliers and remove outliers.

df <- CoordinateCleaner::cc_outl(df, 
              lon = "long", 
              lat = "lat", 
              species = "name")
## Testing geographic outliers
## Removed 40 records.

How many observations do we have now?

nrow(df)
## [1] 161

4. Remove Duplicates

Fix dates

Parse dates into the same format

df$date <- lubridate::ymd(df$date)

Separate date into year, month, day

df <- df %>%
      mutate(year = lubridate::year(date), 
             month = lubridate::month(date), 
             day = lubridate::day(date))

Remove rows with identical lat, long, year, month, and day

df <- distinct(df, lat, long, year, month, day, .keep_all = TRUE)

How many observations do we have now?

nrow(df)
## [1] 157

5. Spatial Correction

Maxent will only retain one point per pixel. To make the ecological niche analysis comparable, we will retain only one point per pixel.

One point per pixel

Read in raster file
bio1 <- raster("data/climate_processing/bioclim/bio_1.tif")
Set resolution
rasterResolution <- max(res(bio1))
Filter

Remove a point which nearest neighbor distance is smaller than the resolution size (aka remove one point in a pair that occurs within one pixel)

while(min(nndist(df[,6:7])) < rasterResolution){
  nnD <- nndist(df[,6:7])
  df <- df[-(which(min(nnD) == nnD) [1]), ]
}

How many observations do we have now?

nrow(df)
## [1] 151

Spatial thinning

Reduce the effects of sampling bias using randomization approach.

Calculate minimum nearest neighbor distance in km
nnDm <- rdist.earth(as.matrix(data.frame(lon = df$long, lat = df$lat)), miles = FALSE, R = NULL)
nnDmin <- do.call(rbind, lapply(1:5, function(i) sort(nnDm[,i])[2]))
min(nnDmin)
## [1] 1.43369
Identify points to keep using spThin
keep <- spThin::thin(loc.data =  df, 
        verbose = FALSE, 
        long.col = "long", 
        lat.col = "lat",
        spec.col = "name",
        thin.par = 0.002, # Studies found 2m distance was enough to collect unique genets
        reps = 1, 
        locs.thinned.list.return = TRUE, 
        write.files = FALSE)[[1]]
Filter df to only include those lat/long that have been kept using spThin
df <- df %>%
       filter((lat %in% keep$Latitude +
                long %in% keep$Longitude) == 2)
nrow(df)
## [1] 151

6. Plot Cleaned Records

Set basemap

USA <- borders(database = "usa", colour = "gray50", fill = "gray50")
state <- borders(database = "state", colour = "black", fill = NA)

Plot

Here we are using ggplot2 and ggspatial to plot our occurrence records. We are using two basemaps, USA and state. The geom_point function adds our points based on long and lat, and colors them blue. We set the coordinate visualization space with coord_sf. We fixed the x and y labels. Finally, we added a scale and north arrow.

simple_map <- ggplot() +
              USA +
              state +
              geom_point(df, 
                         mapping = aes(x = long, y = lat), 
                         col = "blue") +
              coord_sf(xlim = c(min(df$long) - 3, max(df$long) + 3),
                       ylim = c(min(df$lat) - 3, max(df$lat) + 3)) +
              xlab("Longitude") +
              ylab("Latitude") +
              annotation_scale() +
              annotation_north_arrow(height = unit(1, "cm"), 
                                     width = unit(1, "cm"), 
                                     location = "tl")
simple_map
## Using plotunit = 'm'

7. Save Cleaned.csv

write.csv(df, "data/cleaning_demo/Shortia_galacifolia_20210618-cleaned.csv", row.names = FALSE)

8. Make maxent ready

Read in all cleaned files

alldf <- list.files("data/cleaning_demo/raw/", full.names = TRUE, 
                    recursive = FALSE, include.dirs = FALSE, pattern = "*.csv")
alldf <- lapply(alldf, read.csv)
alldf <- do.call(rbind, alldf)

Select needed columns

alldf <- alldf %>%
         dplyr::select(name, lat, long)

Save Maxent.csv

write.csv(alldf, "data/cleaning_demo/maxent_ready/diapensiaceae_maxentready_20210618.csv", row.names = FALSE)

Climate Processing

Climate layer processing.
Modified and created by ML Gaynor.
Based on code by Mike Belitz (mbelitz/Odo_SDM_Rproj).

Load Packages

library(raster)
library(gtools)
library(dplyr)
library(rgdal)
library(sp)
library(rangeBuilder)
library(sf)
library(caret)
library(usdm)
library(dismo)
library(stringr)

Load functions

Based on code by Mike Belitz (mbelitz/Odo_SDM_Rproj).

source("functions/VIFLayerSelect.R")

Load bioclim layers

biolist <- list.files("data/climate_processing/bioclim/", pattern = "*.tif", full.names = TRUE)

Order list using gtools.

biolist <- mixedsort(sort(biolist))

Load rasters as a stack.

biostack <- raster::stack(biolist)

Load occurrence records

And fix the name column class and make sure it is a character.

alldf <- read.csv("data/cleaning_demo/maxent_ready/diapensiaceae_maxentready_20210625.csv")
alldf$name <- as.character(alldf$name)

Present Layers - all

Here we will make the projection layers, or the layers which include shared space. First, we have to define the accessible space.

Make into a spatial point data frame

alldfsp <- alldf
coordinates(alldfsp) <- ~ long + lat
proj4string(alldfsp) <- CRS("+proj=longlat +datum=WGS84")

Create alpha hull

hull <- getDynamicAlphaHull(x = alldfsp@coords, 
                             fraction = 1, # min. fraction of records we want included
                             partCount = 1, # number of polygons
                             initialAlpha = 20, # initial alpha size, 20m
                             clipToCoast = "terrestrial",
                             proj = "+proj=longlat +datum=WGS84")

Visualize

plot(hull[[1]], col=transparentColor('gray50', 0.5), border = NA)
points(x = alldf$long, y = alldf$lat, cex = 0.5, pch = 3)

Add buffer to hull

Calculate buffer size

Here we take the 80th quantile of the max distance between points

buffDist <- quantile(x = (apply(spDists(alldfspTrans), 2,
                                FUN = function(x) sort(x)[2])), 
                     probs = 0.80, na.rm = TRUE) 
Buffer the hull
buffer_m <- buffer(x = hullTrans, width = buffDist, dissolve = TRUE)
buffer <- spTransform(buffer_m, "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
Visualize
plot(buffer, col=transparentColor('gray50', 0.5), border = NA)
points(x = alldf$long, y = alldf$lat, cex = 0.5, pch = 3)

Mask and crop bioclim layers

path <- "data/climate_processing/all/"
end <- ".asc"
for(i in 1:length(biolist)){
    # Subset raster layer
    rast <- biostack[[i]]
    # Setup file names
    name <- names(rast)
    out <- paste0(path, name)
    outfile <- paste0(out, end)
    # Crop and mask
    c <- crop(rast, extent(buffer))
    c <- mask(c, buffer)
    # Write raster
    writeRaster(c, outfile, format = "ascii", overwrite = TRUE)
}

Select layers for MaxEnt

We only want to include layers that are not highly correlated. To assess which layers we will include, we can look at the pearson correlation coefficient among layers.

Stack all layers

clippedlist <- list.files("data/climate_processing/all/", pattern = "*.asc", full.names = TRUE)
clippedlist <- mixedsort(sort(clippedlist))
clippedstack <- raster::stack(clippedlist)

Then calculate the correlation coefficient

corr <- layerStats(clippedstack, 'pearson', na.rm=TRUE)

Isolate only the pearson correlation coefficient and take absolute value.

c <- abs(corr$`pearson correlation coefficient`)

Write file and view in excel

write.csv(c, "data/climate_processing/correlationBioclim.csv", row.names = FALSE)

Highly correlated layers (> |0.80|) can impact the statistical significance of the niche models and therefore must be removed.

Randomly select variables to remove

envtCor <- mixedsort(sort(findCorrelation(c, cutoff = 0.80, names = TRUE, exact = TRUE)))
envtCor
##  [1] "bio_1"  "bio_3"  "bio_4"  "bio_6"  "bio_10" "bio_11" "bio_12" "bio_13"
##  [9] "bio_16" "bio_17" "bio_19"

Variable inflation factor (VIF)

VIF can detect for multicollinearity in a set of multiple regression variables. Run a simple maxent model for every species and calculate the average permutation contribution.

Run preliminary MaxEnt models

Loop through each species and save permutation importance in list.
Warning: This will print warnings even when it works fine.

set.seed(195)
m <- c()
for(i in  1:length(unique(alldf$name))){
    species <- unique(alldf$name)[i]
    spp_df <-  alldf %>%
               dplyr::filter(name == species)
    coordinates(spp_df) <- ~ long + lat
    model <- maxent(x = clippedstack, p = coordinates(spp_df), 
                    progress = "text", silent = FALSE) 
    m[[i]] <- vimportance(model)
}

Bind the dataframes

mc <- do.call(rbind, m)

Calculate the mean and rename columns

mc_average <- aggregate(mc[, 2], list(mc$Variables), mean)
mc_average <- mc_average %>%
              dplyr::select(Variables = Group.1, permutation.importance = x)
mc1 <- mc_average

Select Layers

Use VIF and the MaxEnt permutation importance to select the best variables for your model. Note, this leads to different layers when the models are rerun without setting seed due to permutations being random.

selectedlayers <- VIF_layerselect(clippedstack, mc_average)
mixedsort(sort(names(selectedlayers)))
Correct set for workshop purposes

Since this can vary per system (despite setting seed), we added this line to keep our files consistent for the workshop

sl <- c("bio_3", "bio_7", "bio_8", "bio_9", "bio_14", "bio_15", "bio_18", "elev")
selectedlayers <- raster::subset(clippedstack, sl)

Copy selected layers

Move selected layers to “Present_Layers/all/” for use in MaxEnt later.

for(i in 1:length(names(selectedlayers))){
    name <- names(selectedlayers)[i]
    from <- paste0("data/climate_processing/all/", name, ".asc")
    to <- paste0("data/climate_processing/PresentLayers/all/", name, ".asc")
    file.copy(from, to,
              overwrite = TRUE, recursive = FALSE, 
              copy.mode = TRUE)
}

Create Species Training Layers

for(i in 1:length(unique(alldf$name))){
    species <- unique(alldf$name)[i]
    # Subset species from data frame
    spp_df <-  alldf %>%
               dplyr::filter(name == species)
    # Make spatial
    coordinates(spp_df) <- ~ long + lat
    proj4string(spp_df) <- CRS("+proj=longlat +datum=WGS84")
    
    ## Create alpha hull
    sphull <- getDynamicAlphaHull(x = spp_df@coords, 
                                fraction = 1, # min. fraction of records we want included
                                partCount = 1, # number of polygons
                                initialAlpha = 20, # initial alpha size, 20m
                                clipToCoast = "terrestrial",
                                proj = "+proj=longlat +datum=WGS84")
    
    ### Transform into CRS related to meters
    sphullTrans <- spTransform(sphull[[1]], "+proj=cea +lat_ts=0 +lon_0=0")
    spp_dfTrans <- spTransform(spp_df, "+proj=cea +lat_ts=0 +lon_0")
    
    ### Calculate buffer size
    #### Here we take the 80th quantile of the max distance between points
    spbuffDist <- quantile(x = (apply(spDists(spp_dfTrans), 2, FUN = function(x) sort(x)[2])), 
                         probs = 0.80, na.rm = TRUE) 
    
    ### Buffer the hull
    spbuffer_m <- buffer(x = sphullTrans, width = spbuffDist, dissolve = TRUE)
    spbuffer <- spTransform(spbuffer_m,
                            "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
    ### Crop and Mask
    spec <- gsub(" ", "_", species)
    path <- paste0("data/climate_processing/PresentLayers/", spec,"/")
    end <- ".asc"
    for(j in 1:length(names(selectedlayers))){
        # Subset raster layer
        rast <- selectedlayers[[j]]
        # Setup file names
        name <- names(rast)
        out <- paste0(path, name)
        outfile <- paste0(out, end)
        # Crop and mask
        c <- crop(rast, extent(buffer))
        c <- mask(c, buffer)
        # Write raster
        writeRaster(c, outfile, format = "ascii", overwrite = TRUE)
    }
}

Point Based

Ecological analysis using points. This is based off scripts created by Hannah Owens and James Watling, and Anthony Melton.
Modified and created by ML Gaynor.

Load Packages

library(raster)
library(dplyr)
library(tidyr)
library(gtools)
library(factoextra)
library(FactoMineR)
library(multcompView)
library(ggsci)
library(gridExtra)
library(ggplot2)
library(ecospat)
library(dismo)

Load functions

This function is from vqv/ggbiplot.

source("functions/ggbiplot_copy.R")

Load data file

alldf <- read.csv("data/cleaning_demo/maxent_ready/diapensiaceae_maxentready_20210625.csv")

Raster layers

list <- list.files("data/climate_processing/PresentLayers/all", full.names = TRUE, recursive = FALSE) 
list <- mixedsort(sort(list))
envtStack <- stack(list)

ENM - Realized Niche

Extract value for each point

For each occurence record, extract the value for each bioclim variables using the package raster.

ptExtracted <- raster::extract(envtStack, alldf[3:2])

Convert to data frame

ptExtracteddf <- as.data.frame(ptExtracted)

Add species name

ptExtracteddf <- ptExtracteddf %>%
                 dplyr::mutate(name = as.character(alldf$name), x = alldf$long, y = alldf$lat)

Drop any NA

ptExtracteddf <- ptExtracteddf %>% 
                 tidyr::drop_na(bio_3, bio_7, bio_8, bio_9, bio_14, bio_15, bio_18, elev)

PCA

Create two dataframes.

data.bioclim <- ptExtracteddf[, 1:8]
data.species <- ptExtracteddf[, 9]

Using only the bioclim columns to run the principal components analysis.

data.pca <- prcomp(data.bioclim, scale. = TRUE) 

Understanding the PCA - Optional

When you use the command prcomp your loading variables show up as rotational variables. Thanks to a really great answer on stack overflow you can even convert the rotational variable to show the relative contribution.

loadings <- data.pca$rotation
summary(loadings)
##       PC1               PC2                PC3                PC4          
##  Min.   :-0.4467   Min.   :-0.45110   Min.   :-0.82811   Min.   :-0.30022  
##  1st Qu.:-0.3468   1st Qu.:-0.17283   1st Qu.:-0.23635   1st Qu.:-0.16254  
##  Median :-0.3119   Median : 0.13346   Median : 0.10057   Median : 0.01297  
##  Mean   :-0.1722   Mean   : 0.07061   Mean   :-0.05581   Mean   : 0.09998  
##  3rd Qu.:-0.1394   3rd Qu.: 0.29849   3rd Qu.: 0.15316   3rd Qu.: 0.33112  
##  Max.   : 0.4699   Max.   : 0.51779   Max.   : 0.33317   Max.   : 0.76997  
##       PC5                PC6                 PC7                PC8          
##  Min.   :-0.54202   Min.   :-0.793450   Min.   :-0.60781   Min.   :-0.65942  
##  1st Qu.:-0.15564   1st Qu.:-0.228231   1st Qu.:-0.40173   1st Qu.:-0.18065  
##  Median : 0.12133   Median :-0.077790   Median :-0.24135   Median :-0.07437  
##  Mean   : 0.09021   Mean   :-0.145197   Mean   :-0.20110   Mean   :-0.08703  
##  3rd Qu.: 0.40562   3rd Qu.:-0.005815   3rd Qu.:-0.03176   3rd Qu.: 0.02402  
##  Max.   : 0.48773   Max.   : 0.368720   Max.   : 0.33905   Max.   : 0.57647

There are two options to convert the loading to show the relative contribution, they both give the same answer so either can be used.

Option 1
loadings_relative_A <- t(t(abs(loadings))/rowSums(t(abs(loadings))))*100
loadings_relative_A
##              PC1       PC2       PC3       PC4        PC5        PC6       PC7
## bio_3  14.850656  6.414517 10.356458 33.544007  9.0085898  0.9942151 13.840312
## bio_7  17.109180  3.729291 12.053070 14.159247 19.1765473  1.8019146 24.811665
## bio_8   7.802604 20.544845  6.247980  4.496230  6.1800138 40.2763959  2.836495
## bio_9  11.887131  4.349147 37.772179 13.078961  6.9101503  8.0137969  8.857181
## bio_14 11.101071 18.191180  3.083048 10.036777 19.9384510 22.2996049  3.323073
## bio_15  9.369054 20.880380  6.091511  3.365956  0.9115813 18.7166455 10.846831
## bio_18 16.267425  9.200637 15.196677  6.096127 15.7167679  2.0413594 15.056066
## elev   11.612878 16.690003  9.199078 15.222695 22.1578987  5.8560679 20.428376
##               PC8
## bio_3   0.3835837
## bio_7   3.3281622
## bio_8   4.4111546
## bio_9   3.6289524
## bio_14 22.7072114
## bio_15 32.7980990
## bio_18 28.6727578
## elev    4.0700790
Option 2
loadings_relative_B <- sweep(x = abs(loadings), 
                             MARGIN = 2, 
                             STATS = colSums(abs(loadings)), FUN = "/")*100
loadings_relative_B
##              PC1       PC2       PC3       PC4        PC5        PC6       PC7
## bio_3  14.850656  6.414517 10.356458 33.544007  9.0085898  0.9942151 13.840312
## bio_7  17.109180  3.729291 12.053070 14.159247 19.1765473  1.8019146 24.811665
## bio_8   7.802604 20.544845  6.247980  4.496230  6.1800138 40.2763959  2.836495
## bio_9  11.887131  4.349147 37.772179 13.078961  6.9101503  8.0137969  8.857181
## bio_14 11.101071 18.191180  3.083048 10.036777 19.9384510 22.2996049  3.323073
## bio_15  9.369054 20.880380  6.091511  3.365956  0.9115813 18.7166455 10.846831
## bio_18 16.267425  9.200637 15.196677  6.096127 15.7167679  2.0413594 15.056066
## elev   11.612878 16.690003  9.199078 15.222695 22.1578987  5.8560679 20.428376
##               PC8
## bio_3   0.3835837
## bio_7   3.3281622
## bio_8   4.4111546
## bio_9   3.6289524
## bio_14 22.7072114
## bio_15 32.7980990
## bio_18 28.6727578
## elev    4.0700790

Plotting the PCA

Set theme

First, I made a theme to change the background of the plot. Next, I changed the plot margins and the text size.

theme <- theme(panel.background = element_blank(),
               panel.border=element_rect(fill=NA),
               panel.grid.major = element_blank(),
               panel.grid.minor = element_blank(),
               strip.background=element_blank(),
               axis.ticks=element_line(colour="black"),
               plot.margin=unit(c(1,1,1,1),"line"), 
               axis.text = element_text(size = 12), 
               legend.text = element_text(size = 12), 
               legend.title = element_text(size = 12),
               text = element_text(size = 12))
Set color palette
pal <- pal_locuszoom()(4)
ggbiplot

Next, use ggbiplot where obs.scale indicates the scale factor to apply to observation, var.scale indicates the scale factor to apply to variables, ellipse as TRUE draws a normal data ellipse for each group, and circle as TRUE draws a correlation circle.

g <- ggbiplot(data.pca, obs.scale = 1, var.scale = 1, 
              groups = data.species, ellipse = TRUE, circle = TRUE) +
     scale_color_manual(name = '', values = pal) +
     theme(legend.direction = 'vertical', legend.position = 'right', 
           legend.text = element_text(size = 12, face = "italic")) +
     theme
 
g


ANOVA

Simple function to run an ANOVA and a post-hoc Tukey-HSD test

stat.test <- function(data = ptExtracteddf, x = "name", y){
    bioaov <- aov(as.formula(paste0(y,"~",x)), data = data) 
    TH <- TukeyHSD(bioaov, "name")
    m <- multcompLetters(TH$name[,4])
    groups <- data.frame(groups = m$Letters, name = names(m$Letters))
    return(groups)
}

Run for bio_3 only

bio3aovplot <- ggplot(ptExtracteddf, aes(x = name, y = bio_3)) +
               geom_boxplot(aes(fill = name)) +
               scale_color_manual(name = '', values = pal) +
               geom_text(data = stat.test(y = "bio_3"), 
                        mapping = aes(x = name,
                                      y = max(ptExtracteddf["bio_3"]+1), 
                                      label = groups), 
                        size = 5, inherit.aes = FALSE) +
               theme(axis.text.x = element_text(angle = 90, size = 8, face = 'italic'))
bio3aovplot

Loop through all variables

variablelist <- colnames(ptExtracteddf)[1:8]
plotlist <- c()
for(i in 1:8){
  bio <- variablelist[i]
  tempdf <- ptExtracteddf %>%
            dplyr::select(name, variablelist[i])
  plotlist[[i]] <- ggplot(tempdf, aes(x = name, y = tempdf[,2])) +
                    geom_boxplot(aes(fill = name)) +
                    scale_colour_manual(name = 'Species', values = pal) +
                    geom_text(data = stat.test(y = variablelist[i]), 
                              mapping = aes(x = name,
                                            y = max(tempdf[,2]+1), 
                                            label = groups), 
                              size = 5, inherit.aes = FALSE) +
                    scale_x_discrete(labels = c('G', 'Pba','Pbr', 'S')) +
                    ggtitle(label = paste0(variablelist[i])) +
                    ylab(paste0(variablelist[i])) +
                    theme(legend.position = "none")
}

gridExtra::grid.arrange(grobs = plotlist)


Ecospat Niche Overlap and Niche Equivalency

Set up background points

bg1 <- randomPoints(mask = envtStack, n = 1000, p = alldf[,3:2])
bg1.env <- raster::extract(envtStack, bg1)
bg1.env <- data.frame(bg1.env)
allpt.bioclim <- rbind(bg1.env, data.bioclim)  

dudi.PCA to reduce variables

pca.env <- dudi.pca(allpt.bioclim,
                    center = TRUE, # Center by the mean
                    scannf = FALSE, # Don't plot
                    nf = 2) # Number of axis to keep 

Pull out scores for each species

p1.score <- suprow(pca.env, dplyr::filter(ptExtracteddf, name == "Galax urceolata")[, 1:8])$li
p2.score <- suprow(pca.env, dplyr::filter(ptExtracteddf, name == "Pyxidanthera barbulata")[, 1:8])$li
p3.score <- suprow(pca.env, dplyr::filter(ptExtracteddf, name == "Pyxidanthera brevifolia")[, 1:8])$li
p4.score <- suprow(pca.env, dplyr::filter(ptExtracteddf, name == "Shortia galacifolia")[, 1:8])$li
scores.clim <- pca.env$li
Visualize
plot(scores.clim, pch = 16, asp = 1,
     col = adjustcolor(1, alpha.f = 0.2), cex = 2,
     xlab = "PC1", ylab = "PC2") 
points(p1.score, pch = 18, col = pal[1], cex = 2)
points(p2.score, pch = 18, col = pal[2], cex = 2)
points(p3.score, pch = 18, col = pal[3], cex = 2)
points(p4.score, pch = 18, col = pal[4], cex = 2)

Kernel density estimates

Create occurrence density grids based on the ordination data.

z1 <- ecospat.grid.clim.dyn(scores.clim, scores.clim, p1.score, R = 100)
z2 <- ecospat.grid.clim.dyn(scores.clim, scores.clim, p2.score, R = 100)
z3 <- ecospat.grid.clim.dyn(scores.clim, scores.clim, p3.score, R = 100)
z4 <- ecospat.grid.clim.dyn(scores.clim, scores.clim, p4.score, R = 100)
zlist  <- list(z1, z2, z3, z4)

Niche Overlap

Schoener’s D ranges from 0 to 1. 0 represents no similarity between niche space. 1 represents completely identical niche space.

overlapD <- matrix(ncol = 2, nrow = 7)
n <- 1
for(i in 1:3){
  for(j in 2:4){
    if(i != j){
      overlapD[n, 1]<- paste0("z", i, "-", "z", j)
      overlapD[n, 2]<- ecospat.niche.overlap(zlist[[i]], zlist[[j]], cor = TRUE)$D
      n <- n + 1
    }
  }
}

overlapDdf <- data.frame(overlapD)
overlapDdf
##      X1                  X2
## 1 z1-z2   0.145352580822439
## 2 z1-z3 0.00131113812012462
## 3 z1-z4   0.518715321036678
## 4 z2-z3 0.00337172752511095
## 5 z2-z4   0.241888138587862
## 6 z3-z2 0.00337172752511095
## 7 z3-z4  0.0022848212591513
Niche Overlap Visualization
par(mfrow=c(1,2))
ecospat.plot.niche.dyn(z1, z4, quant=0.25, interest = 1
                       , title= "Niche Overlap - Z1 top", name.axis1="PC1", name.axis2="PC2")
ecospat.plot.niche.dyn(z1, z4, quant=0.25, interest = 2
                       , title= "Niche Overlap - Z4 top", name.axis1="PC1", name.axis2="PC2")

Niche Equivalency Test

Based on Warren et al. 2008 - Are the two niche identical?
Hypothesis test for D, null based on randomization. H1: the niche overlap is higher than expected by chance (or when randomized).

eq.test <- ecospat.niche.equivalency.test(z1, z4, rep = 10, alternative = "greater")
ecospat.plot.overlap.test(eq.test, "D", "Equivalency")

Niche Similarity Test

Based on Warren et al. 2008 - Are the two niche similar?
Can one species’ niche predict the occurrences of a second species better than expected by chance?

sim.test <- ecospat.niche.similarity.test(z1, z4, rep = 10, alternative = "greater", rand.type=2)
ecospat.plot.overlap.test(sim.test, "D", "Similarity")

Ecological Niche Modeling

Ecological Niche Modeling in R.
Modified and created by Anthony Melton and ML Gaynor.

This script is for generating and testing ENMs using ENMEval. Please see the paper describing ENMEval and their vignette.

Set up java memory

options(java.parameters = "- Xmx16g") # increase memory that can be used

Load Packages

library(raster)
library(gtools)
library(dplyr)
library(dismo)
library(ENMeval)
library(ggplot2)
library(viridis)

Load Function

source("functions/ENMevaluation.R")

Load data file

alldf <- read.csv("data/cleaning_demo/maxent_ready/diapensiaceae_maxentready_20210625.csv")

Subset for each species

Galax_urceolata <- dplyr::filter(alldf, name == "Galax urceolata")
Pyxidanthera_barbulata <- dplyr::filter(alldf, name == "Pyxidanthera barbulata")
Pyxidanthera_brevifolia <- dplyr::filter(alldf, name == "Pyxidanthera brevifolia")
Shortia_galacifolia <- dplyr::filter(alldf, name == "Shortia galacifolia")

Raster layers

list <- list.files("data/climate_processing/PresentLayers/all", full.names = TRUE, recursive = FALSE) 
list <- mixedsort(sort(list))
allstack <- stack(list)

Read in species training layers

gstack <- stack(mixedsort(sort(list.files("data/climate_processing/PresentLayers/Galax_urceolata/", full.names = TRUE))))
pbastack <- stack(mixedsort(sort(list.files("data/climate_processing/PresentLayers/Pyxidanthera_barbulata/", full.names = TRUE))))
pbrstack <- stack(mixedsort(sort(list.files("data/climate_processing/PresentLayers/Pyxidanthera_brevifolia/", full.names = TRUE))))
sstack <- stack(mixedsort(sort(list.files("data/climate_processing/PresentLayers/Shortia_galacifolia/", full.names = TRUE))))

Fix projection

projection(allstack) <- "+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs"
projection(gstack) <- "+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs"
projection(pbastack) <- "+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs"
projection(pbrstack) <- "+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs"
projection(sstack) <- "+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs"

Model Generation

dismo Model Generations

Match MaxEnt GUI settings (see word document and powerpoint)

evaldis <- dismo::maxent(x = gstack, p = Galax_urceolata[, c("long", "lat")], nbg = 10000,
                         args = c("projectionlayers=data/climate_processing/PresentLayers/all",
                                  "responsecurves", "jackknife",  
                                  "outputformat=logistic","randomseed", 
                                  "randomtestpoints=25",  "replicates=5", 
                                  "replicatetype=subsample",  "maximumiterations=5000",
                                  "writebackgroundpredictions","responsecurvesexponent",
                                  "writeplotdata"), 
                         removeDuplicates = TRUE
                         #,path = "data/Ecological_Niche_Modeling/enm_output/Galax_urceolata/"
              )
## This is MaxEnt version 3.4.1

ENMeval Model generation

ENMeval will generate multiple models and test them per the specified test method.Two important variables here are the regularization multiplier (RM) value and the feature class (FC). FC will allow for different shapes in response curves (linear, hinge, quadratic, product, and threshold) can be used in the model and RM will influence how many parameters are included in the model.

eval1 <- ENMeval::ENMevaluate(occ = Galax_urceolata[, c("long", "lat")], 
                              env = gstack,
                              tune.args = list(fc = c("L","Q"), rm = 1:2), 
                              partitions = "block",
                              n.bg = 10000,
                              parallel = FALSE,
                              algorithm = 'maxent.jar', 
                              user.eval = proc)
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |==================                                                    |  25%This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## 
  |                                                                            
  |===================================                                   |  50%This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## 
  |                                                                            
  |====================================================                  |  75%This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## 
  |                                                                            
  |======================================================================| 100%This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1

Model Statistics

dismo

Inspect the dismo model based on the html

browseURL(evaldis@html)

ENMeval

Inspect the many models

Visualize

maps <- eval1@predictions
plot(maps)

Look at model overlap

mod_overlap <- calc.niche.overlap(eval1@predictions, overlapStat = "D")
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |======================================================================| 100%
kable(mod_overlap) %>%
   kable_styling(bootstrap_options = c("striped", "hover", "condensed", font_size = 10)) %>%
   scroll_box(width = "100%", height = "200px")
fc.L_rm.1 fc.Q_rm.1 fc.L_rm.2 fc.Q_rm.2
fc.L_rm.1 NA NA NA NA
fc.Q_rm.1 0.9536825 NA NA NA
fc.L_rm.2 0.9850919 0.9522876 NA NA
fc.Q_rm.2 0.9515567 0.9863397 0.95379 NA

Inspect the results

Identify the best model selecting models with the lowest average test omission rate and the highest average validation AUC

results <- eval.results(eval1)
opt.seq <- results %>% 
            dplyr::filter(or.10p.avg == min(or.10p.avg)) %>% 
            dplyr::filter(auc.val.avg == max(auc.val.avg))
kable(opt.seq)  %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", font_size = 10)) %>%
   scroll_box(width = "100%", height = "200px")
fc rm tune.args auc.train cbi.train auc.diff.avg auc.diff.sd auc.val.avg auc.val.sd cbi.val.avg cbi.val.sd or.10p.avg or.10p.sd or.mtp.avg or.mtp.sd proc_auc_ratio.avg proc_auc_ratio.sd proc_pval.avg proc_pval.sd AICc delta.AICc w.AIC ncoef
Q 2 fc.Q_rm.2 0.7779521 0.844 0.145921 0.0987648 0.7840933 0.1477396 0.77225 0.1242293 0.1527778 0.2373334 0.0138889 0.0277778 1.961035 0.0121882 0 0 1858.097 5.63308 0.0391711 7

Subset model

mod.seq <- eval.models(eval1)[[opt.seq$tune.args]]

Inspect

kable(mod.seq@results) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", font_size = 10)) %>%
  scroll_box(width = "100%", height = "200px")
X.Training.samples 72.0000
Regularized.training.gain 0.6478
Unregularized.training.gain 0.7199
Iterations 200.0000
Training.AUC 0.7760
X.Background.points 10070.0000
bio_14.contribution 21.1306
bio_15.contribution 1.9885
bio_18.contribution 2.6430
bio_3.contribution 15.7797
bio_7.contribution 4.4342
bio_8.contribution 1.7713
bio_9.contribution 1.0230
elev.contribution 51.2297
bio_14.permutation.importance 0.0000
bio_15.permutation.importance 33.4249
bio_18.permutation.importance 4.7675
bio_3.permutation.importance 28.0884
bio_7.permutation.importance 33.6544
bio_8.permutation.importance 0.0647
bio_9.permutation.importance 0.0000
elev.permutation.importance 0.0000
Entropy 8.5695
Prevalence..average.probability.of.presence.over.background.sites. 0.3160
Fixed.cumulative.value.1.cumulative.threshold 1.0000
Fixed.cumulative.value.1.Cloglog.threshold 0.0779
Fixed.cumulative.value.1.area 0.9181
Fixed.cumulative.value.1.training.omission 0.0139
Fixed.cumulative.value.5.cumulative.threshold 5.0000
Fixed.cumulative.value.5.Cloglog.threshold 0.1730
Fixed.cumulative.value.5.area 0.7646
Fixed.cumulative.value.5.training.omission 0.0556
Fixed.cumulative.value.10.cumulative.threshold 10.0000
Fixed.cumulative.value.10.Cloglog.threshold 0.2321
Fixed.cumulative.value.10.area 0.6498
Fixed.cumulative.value.10.training.omission 0.1111
Minimum.training.presence.cumulative.threshold 0.9924
Minimum.training.presence.Cloglog.threshold 0.0778
Minimum.training.presence.area 0.9185
Minimum.training.presence.training.omission 0.0000
X10.percentile.training.presence.cumulative.threshold 7.5963
X10.percentile.training.presence.Cloglog.threshold 0.2064
X10.percentile.training.presence.area 0.7004
X10.percentile.training.presence.training.omission 0.0972
Equal.training.sensitivity.and.specificity.cumulative.threshold 34.3959
Equal.training.sensitivity.and.specificity.Cloglog.threshold 0.3760
Equal.training.sensitivity.and.specificity.area 0.2974
Equal.training.sensitivity.and.specificity.training.omission 0.2917
Maximum.training.sensitivity.plus.specificity.cumulative.threshold 56.1734
Maximum.training.sensitivity.plus.specificity.Cloglog.threshold 0.5194
Maximum.training.sensitivity.plus.specificity.area 0.0998
Maximum.training.sensitivity.plus.specificity.training.omission 0.4028
Balance.training.omission..predicted.area.and.threshold.value.cumulative.threshold 0.9924
Balance.training.omission..predicted.area.and.threshold.value.Cloglog.threshold 0.0778
Balance.training.omission..predicted.area.and.threshold.value.area 0.9185
Balance.training.omission..predicted.area.and.threshold.value.training.omission 0.0000
Equate.entropy.of.thresholded.and.original.distributions.cumulative.threshold 17.3506
Equate.entropy.of.thresholded.and.original.distributions.Cloglog.threshold 0.2860
Equate.entropy.of.thresholded.and.original.distributions.area 0.5231
Equate.entropy.of.thresholded.and.original.distributions.training.omission 0.1944

Look at variable contribution

plot(mod.seq)

Look at the response curves

dismo::response(mod.seq)
## This is MaxEnt version 3.4.1
## This is MaxEnt version 3.4.1
## This is MaxEnt version 3.4.1
## This is MaxEnt version 3.4.1
## This is MaxEnt version 3.4.1
## This is MaxEnt version 3.4.1
## This is MaxEnt version 3.4.1
## This is MaxEnt version 3.4.1

Project model to allstack

p <- predict(mod.seq, allstack) 
## This is MaxEnt version 3.4.1

Visualize

# Make into a data frame
p_df <-  as.data.frame(p, xy = TRUE)

# Plot
ggplot() +
  geom_raster(data = p_df, aes(x = x, y = y, fill = layer)) +
  geom_point(data= Galax_urceolata, 
             mapping = aes(x = long, y = lat), 
             col='red', cex=0.05) +
  coord_quickmap() +
  theme_bw() + 
  scale_fill_gradientn(colours = viridis::viridis(99),
                       na.value = "black")


Save outputs

R saved dataset

save(mod.seq, file = "data/Ecological_Niche_Modeling/enm_output/ENMeval/GalaxENM.rda")

Save Raster

writeRaster(x = p, filename = "data/Ecological_Niche_Modeling/enm_output/ENMeval/GalaxENM.asc",
            format = "ascii", NAFlag = "-9999", overwrite = T)

ENM Processing

Processing Ecological Niche Models in R.
Script by Anthony Melton and ML Gaynor.

Load Packages

library(raster)
library(gtools)
library(dplyr)
library(ENMTools)
library(ENMeval)
library(ape)
library(RStoolbox)
library(hypervolume)
library(phytools)

Load functions

The following command generates a binary predicted occurrence map. This was written by Anthony Melton.

source("functions/Functions_AEM.R")

Read in models you generated with the MaxEnt GUI into R.

sp1_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Galax_urceolata_avg.asc")
sp2_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Pyxidanthera_barbulata_avg.asc")
sp3_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Pyxidanthera_brevifolia_avg.asc")
sp4_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Shortia_galacifolia_avg.asc")

Read in Bioclim layers

These steps are described in the previous script

system("rm -rf data/climate_processing/PresentLayers/all/maxent.cache/")
list <- list.files("data/climate_processing/PresentLayers/all/", 
                   full.names = TRUE, recursive = FALSE) 
list <- mixedsort(sort(list))
allstack <- stack(list)
projection(allstack) <- "+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs"

ENM Breadth

Niche breadth is the breadth of environmental factors for a species’ niche, ranging from 0 to 1. When breadth is closer to 1 the more generalist species with wider tolerances. Values closer to 0 indicate a more specialized species. The raster.breadth command in ENMTools measures the smoothness of suitability scores across a projected landscape. The higher the score, the more of the available niche space a species occupies.

sp1_breadth <- ENMTools::raster.breadth(x = sp1_enm.mx.b)
sp1_breadth$B2
## [1] 0.6369572
sp2_breadth <- ENMTools::raster.breadth(x = sp2_enm.mx.b)
sp2_breadth$B2
## [1] 0.1665568
sp3_breadth <- ENMTools::raster.breadth(x = sp3_enm.mx.b)
sp3_breadth$B2
## [1] 0.04380163
sp4_breadth <- ENMTools::raster.breadth(x = sp4_enm.mx.b)
sp4_breadth$B2
## [1] 0.4567

ENM Overlap

Calculating niche overlap, Schoener’s D, with ENMEval - Schoener’s D ranges from 0 to 1, where 0 represents no similarity between the projections and 1 represents completely identical projections.

Stack the projections and make sure the stack is named.

enm_stack.b <- stack(sp1_enm.mx.b, sp2_enm.mx.b, sp3_enm.mx.b, sp4_enm.mx.b)
names(enm_stack.b) <- c("Galax urceolata", "Pyxidanthera barbulata",  "Pyxidanthera brevifolia","Shortia galacifolia" )

Calculate

calc.niche.overlap(enm_stack.b, overlapStat = "D")
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |======================================================================| 100%
##                         Galax.urceolata Pyxidanthera.barbulata
## Galax.urceolata                      NA                     NA
## Pyxidanthera.barbulata        0.2108513                     NA
## Pyxidanthera.brevifolia       0.1287981              0.1614858
## Shortia.galacifolia           0.6447093              0.1228339
##                         Pyxidanthera.brevifolia Shortia.galacifolia
## Galax.urceolata                              NA                  NA
## Pyxidanthera.barbulata                       NA                  NA
## Pyxidanthera.brevifolia                      NA                  NA
## Shortia.galacifolia                  0.03923816                  NA

Phylogenetic correlations

Let’s look at niche overlap over a tree!

Load the tree file

tree <- ape::read.tree(file = 
                         "data/Ecological_Niche_Modeling/AOC_Test_Demo/diapensiaceae_subset.tre")
tree <- ape::compute.brlen(phy = tree, method = "grafen")
plot(tree)

Drop the outgroup

tree <- drop.tip(tree, "Cyrilla_racemiflora")
plot(tree)

Generate species objects for each tree member!

sp1 <- enmtools.species(species.name = "Galax_urceolata",
                        presence.points = Galax_urceolata[,3:2])
sp1$range <- background.raster.buffer(sp1$presence.points, 25000, mask = allstack)
sp1$background.points = background.points.buffer(points = sp1$presence.points, radius = 5000, n = 10000, mask =  allstack[[1]])
##############################
sp2 <- enmtools.species(species.name = "Pyxidanthera_barbulata",
                                        presence.points = Pyxidanthera_barbulata[,3:2])
sp2$range <- background.raster.buffer(sp2$presence.points, 25000, mask = allstack)
sp2$background.points = background.points.buffer(points = sp2$presence.points, radius = 5000, n = 10000, mask =  allstack[[1]])
#############################
sp3 <- enmtools.species(species.name = "Pyxidanthera_brevifolia",
                                        presence.points = Pyxidanthera_brevifolia[,3:2])
sp3$range <- background.raster.buffer(sp3$presence.points, 25000, mask = allstack)
sp3$background.points = background.points.buffer(points = sp3$presence.points, radius = 5000, n = 10000, mask =  allstack[[1]])
#############################
sp4 <- enmtools.species(species.name = "Shortia_galacifolia",
                                          presence.points = Shortia_galacifolia[,3:2])
sp4$range <- background.raster.buffer(sp4$presence.points, 25000, mask = allstack)
sp4$background.points = background.points.buffer(points = sp4$presence.points, radius = 5000, n = 10000, mask =  allstack[[1]])

Create “clade” object with all the species in the tree

clade=enmtools.clade(species = list(sp1, sp2, sp3, sp4), tree = tree)
check.clade(clade)
## 
## 
## An enmtools.clade object with 4 species
## 
## Species names: 
##   Shortia_galacifolia     Pyxidanthera_brevifolia     Pyxidanthera_barbulata      Galax_urceolata
## 
## Tree: 
## 
## Phylogenetic tree with 4 tips and 3 internal nodes.
## 
## Tip labels:
##   Shortia_galacifolia, Pyxidanthera_brevifolia, Pyxidanthera_barbulata, Galax_urceolata
## 
## Rooted; includes branch lengths.
## 
## 
## Data Summary: 
## <table>
##  <thead>
##   <tr>
##    <th style="text-align:left;">   </th>
##    <th style="text-align:left;"> species.names </th>
##    <th style="text-align:left;"> in.tree </th>
##    <th style="text-align:left;"> presence </th>
##    <th style="text-align:left;"> background </th>
##    <th style="text-align:left;"> range </th>
##   </tr>
##  </thead>
## <tbody>
##   <tr>
##    <td style="text-align:left;"> Shortia_galacifolia </td>
##    <td style="text-align:left;"> Shortia_galacifolia </td>
##    <td style="text-align:left;"> TRUE </td>
##    <td style="text-align:left;"> 12 </td>
##    <td style="text-align:left;"> 10000 </td>
##    <td style="text-align:left;"> present </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;"> Pyxidanthera_brevifolia </td>
##    <td style="text-align:left;"> Pyxidanthera_brevifolia </td>
##    <td style="text-align:left;"> TRUE </td>
##    <td style="text-align:left;"> 33 </td>
##    <td style="text-align:left;"> 10000 </td>
##    <td style="text-align:left;"> present </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;"> Pyxidanthera_barbulata </td>
##    <td style="text-align:left;"> Pyxidanthera_barbulata </td>
##    <td style="text-align:left;"> TRUE </td>
##    <td style="text-align:left;"> 234 </td>
##    <td style="text-align:left;"> 10000 </td>
##    <td style="text-align:left;"> present </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;"> Galax_urceolata </td>
##    <td style="text-align:left;"> Galax_urceolata </td>
##    <td style="text-align:left;"> TRUE </td>
##    <td style="text-align:left;"> 72 </td>
##    <td style="text-align:left;"> 10000 </td>
##    <td style="text-align:left;"> present </td>
##   </tr>
## </tbody>
## </table>

Age-Range Correlation Test

range.aoc <- enmtools.aoc(clade = clade,  nreps = 10, overlap.source = "range")
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
summary(range.aoc)
## 
## 
## Age-Overlap Correlation test
## 
## 10 replicates 
## 
## p values:
##      (Intercept) empirical.df$age 
##        0.1363636        0.1363636

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## NULL

Age-Overlap Correlation Test

glm.aoc <- enmtools.aoc(clade = clade,  env = allstack, nreps = 10, overlap.source = "glm")
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
summary(glm.aoc)
## 
## 
## Age-Overlap Correlation test
## 
## 10 replicates 
## 
## p values:
##      (Intercept) empirical.df$age 
##              0.5              0.5

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## NULL

The results here are pretty meaningless since we’re looking at very few species, but it serves the purpose of the demo. For range AOCs, an intercept >0.5 and negative slope are indicative of sympatric species, while an intercept of <0.5 and a positive slope are indicative on non-sympatric speciation. A low intercept and positive slope for niche overlap would indicate niche divergence. ***

Hypervolume

This will generate a binary map with any cell that contains a suitability score greater than or equal to the lowest score with an occurrence point as a presence. There are other ways to set the threshold, including percentiles or training statistics.

Load occurrence data

alldf <- read.csv("data/cleaning_demo/maxent_ready/diapensiaceae_maxentready_20210625.csv")

Subset for each species

Galax_urceolata <- dplyr::filter(alldf, name == "Galax urceolata")
Pyxidanthera_barbulata <- dplyr::filter(alldf, name == "Pyxidanthera barbulata")
Pyxidanthera_brevifolia <- dplyr::filter(alldf, name == "Pyxidanthera brevifolia")
Shortia_galacifolia <- dplyr::filter(alldf, name == "Shortia galacifolia")

Create the binary maps

sp1.dist <- make.binary.map(model = sp1_enm.mx.b, occ.dat = Galax_urceolata)
sp2.dist <- make.binary.map(model = sp2_enm.mx.b, occ.dat = Pyxidanthera_barbulata)
sp3.dist <- make.binary.map(model = sp3_enm.mx.b, occ.dat = Pyxidanthera_brevifolia)
sp4.dist <- make.binary.map(model = sp4_enm.mx.b, occ.dat = Shortia_galacifolia)

Plot

par(mfrow = c(2,2),  mar = c(1,1,1,1))
plot(sp1.dist)
plot(sp2.dist)
plot(sp3.dist)
plot(sp4.dist)

Hypervolume

Next, let’s work on getting some data from the predicted distributions! Niche space can be thought of as a multi-dimensional hypervolume. We’re using climatic data in this case, so we’re measuring the hypervolume of climatic niche space occupied by these species.
Warning: This takes forever!

sp1.hv <- get_hypervolume(binary_projection = sp1.dist, envt = allstack)
sp2.hv <- get_hypervolume(binary_projection = sp2.dist, envt = allstack)
sp3.hv <- get_hypervolume(binary_projection = sp3.dist, envt = allstack)
sp4.hv <- get_hypervolume(binary_projection = sp4.dist, envt = allstack)

Compare the hypervolumes

hv_set <- hypervolume_set(hv1 = sp1.hv, hv2 = sp2.hv, check.memory = F)
hypervolume_overlap_statistics(hv_set)
plot(hv_set)
get_volume(hv_set)

Phylogenetic Diversity

Original script by J. Janzten.
Modified by ML Gaynor.

Load Packages

library(raster)
library(ape)
library(phytools)
library(picante)
library(ggplot2)
library(dplyr)
library(tidyverse)
library(rgdal)
library(rgeos)
library(sp)

Load the tree file

tree <- ape::read.tree(file = "data/Ecological_Niche_Modeling/AOC_Test_Demo/diapensiaceae_subset.tre")
tree <- ape::compute.brlen(phy = tree, method = "grafen")
plot(tree)

Drop the outgroup

tree <- drop.tip(tree, "Cyrilla_racemiflora")

Read ENM models

sp1_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Galax_urceolata_avg.asc")
sp2_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Pyxidanthera_barbulata_avg.asc")
sp3_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Pyxidanthera_brevifolia_avg.asc")
sp4_enm.mx.b <- raster("data/Ecological_Niche_Modeling/enm_output/Shortia_galacifolia_avg.asc")

Reclassify rasters

reclassify_raster <- function(OGraster){
  ## Reclassify the raster by the suitability score
  OGraster[OGraster >= 0.25] <- 1
  OGraster[OGraster < 0.25] <- 0
  OGraster[is.na(OGraster)] <- 0
  crs(OGraster) <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
  return(OGraster)
}
sp1_enm.mx.b  <- reclassify_raster(sp1_enm.mx.b)
sp2_enm.mx.b  <- reclassify_raster(sp2_enm.mx.b)
sp3_enm.mx.b  <- reclassify_raster(sp3_enm.mx.b)
sp4_enm.mx.b  <- reclassify_raster(sp4_enm.mx.b)

Inspect the reclassified layers

par(mfrow=c(2,2))
plot(sp1_enm.mx.b)
plot(sp2_enm.mx.b)
plot(sp3_enm.mx.b)
plot(sp4_enm.mx.b)

Stack rasters and create dataframe

enm_stack <- stack(sp1_enm.mx.b, sp2_enm.mx.b, sp3_enm.mx.b, sp4_enm.mx.b)
names(enm_stack) <- c("Galax_urceolata", "Pyxidanthera_barbulata",  "Pyxidanthera_brevifolia","Shortia_galacifolia" )

Convert to dataframe

rasterstack_df <- as.data.frame(enm_stack, xy = TRUE)

Extract Ecoregions for points

Load USA shapefile

Obtained at https://www.epa.gov/eco-research/level-iii-and-iv-ecoregions-continental-united-states

shp <- ("data/phylogenetic_diversity/USraster/us_eco_l4/us_eco_l4_no_st.shp")
USAshape <- readOGR(shp, layer="us_eco_l4_no_st")
## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/shellygaynor/Dropbox/Workshops/BotanyENMWorkshop2021/Demos/Rbased/CrashCourse/data/phylogenetic_diversity/USraster/us_eco_l4/us_eco_l4_no_st.shp", layer: "us_eco_l4_no_st"
## with 5896 features
## It has 16 fields

Correct CRS

newcrs <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
USAshape2 <- spTransform(USAshape, newcrs)

Convert dataframe to points

df.sp <- rasterstack_df[,1:2]
coordinates(df.sp)<-~x+y
crs(df.sp) <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

Extract points from shapefile

Warning: this will take a while!!

df.sp.info <- sp::over(df.sp, USAshape2)

Bind dataframe with points and with ecosystems

rasterstack_df.info <- cbind(df.sp.info, rasterstack_df)
rasterstack_df.info2 <- rasterstack_df.info %>%
                        dplyr::group_by(L4_KEY) %>% 
                        dplyr::summarize(Galax_urceolata = max(Galax_urceolata),
                                         Pyxidanthera_barbulata = max(Pyxidanthera_barbulata),
                                         Pyxidanthera_brevifolia = max(Pyxidanthera_brevifolia), 
                                         Shortia_galacifolia = max(Shortia_galacifolia))
rasterstack_df.info2df <- as.data.frame(rasterstack_df.info2, row.names = FALSE)

Removing NA

rasterstack_df.info4 <- rasterstack_df.info2df[1:137,]

Removing the first column

rasterstack_df.info3 <- rasterstack_df.info4[,-1]

Renaming the rows

row.names(rasterstack_df.info3) <- NULL
row.names(rasterstack_df.info3) <- rasterstack_df.info4[,1]

kable(rasterstack_df.info3)%>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", font_size = 10)) %>%
  scroll_box(width = "100%", height = "200px")
Galax_urceolata Pyxidanthera_barbulata Pyxidanthera_brevifolia Shortia_galacifolia
45a Southern Inner Piedmont 1 0 0 1
45b Southern Outer Piedmont 1 0 1 1
45c Carolina Slate Belt 1 0 1 1
45d Talladega Upland 1 0 0 0
45e Northern Inner Piedmont 1 0 0 1
45f Northern Outer Piedmont 0 0 1 1
45g Triassic Basins 1 0 1 1
45i Kings Mountain 1 0 0 1
54a Illinois/Indiana Prairies 0 0 0 0
54b Chicago Lake Plain 0 0 0 0
54c Kankakee Marsh 0 0 0 0
54d Sand Area 0 0 0 0
54f Valparaiso-Wheaton Morainal Complex 0 0 0 0
55a Clayey High Lime Till Plains 0 0 0 0
55b Loamy High Lime Till Plains 0 0 0 0
55c Mad River Interlobate Area 0 0 0 0
55d Pre-Wisconsinan Drift Plains 0 0 0 0
55e Darby Plains 0 0 0 0
55f Whitewater Interlobate Area 0 0 0 0
56a Northern Indiana Lake Country 0 0 0 0
56b Battle Creek/Elkhart Outwash Plain 0 0 0 0
56c Middle Tippecanoe Plains 0 0 0 0
56d Michigan Lake Plain 0 0 0 0
57a Maumee Lake Plain 0 0 0 0
57b Oak Openings 0 0 0 0
57c Paulding Plains 0 0 0 0
57d Marblehead Drift/Limestone Plain 0 0 0 0
58b Western New England Marble Valleys 0 0 0 0
58e Berkshire Transition 0 0 0 0
58h Reading Prong 0 0 0 0
58i Glaciated Reading Prong/Hudson Highlands 0 0 0 0
58x Taconic Foothills 0 0 0 0
59a Connecticut Valley 0 0 0 1
59c Southern New England Coastal Plains and Hills 0 1 0 1
59g Long Island Sound Coastal Lowland 0 1 0 1
59i Hudson Valley 0 0 0 0
60a Glaciated Low Allegheny Plateau 0 0 0 0
60b Delaware-Neversink Highlands 0 0 0 0
61b Mosquito Creek/Pymatuning Lowlands 0 0 0 0
61c Low Lime Drift Plain 0 0 0 0
61d Erie Gorges 0 0 0 0
61e Summit Interlobate Area 0 0 0 0
62a Pocono High Plateau 0 0 0 0
62b Low Poconos 0 0 0 0
62c Glaciated High Allegheny Plateau 0 0 0 0
62d Unglaciated High Allegheny Plateau 0 0 0 0
63a Delaware River Terraces and Uplands 0 1 0 1
63b Chesapeake-Pamlico Lowlands and Tidal Marshes 1 1 0 1
63c Swamps and Peatlands 1 1 0 0
63d Virginian Barrier Islands and Coastal Marshes 1 1 0 0
63e Mid-Atlantic Flatwoods 1 1 0 0
63f Delmarva Uplands 0 1 0 0
63g Carolinian Barrier Islands and Coastal Marshes 1 1 1 0
63h Carolina Flatwoods 1 1 1 0
63n Mid-Atlantic Floodplains and Low Terraces 1 1 0 0
64a Triassic Lowlands 0 1 0 1
64b Trap Rock and Conglomerate Uplands 0 1 0 1
64c Piedmont Uplands 0 1 0 0
64d Piedmont Limestone/Dolomite Lowlands 0 1 0 0
64e Glaciated Triassic Lowlands 0 1 0 0
64f Passaic Basin Freshwater Wetlands 0 1 0 0
64g Hackensack Meadowlands 0 1 0 0
65c Sand Hills 1 1 1 1
65i Fall Line Hills 0 0 0 0
65j Transition Hills 0 0 0 0
65l Atlantic Southern Loam Plains 1 1 1 0
65m Rolling Coastal Plain 1 1 1 1
65n Chesapeake Rolling Coastal Plain 0 1 0 1
65p Southeastern Floodplains and Low Terraces 1 0 1 0
66a Northern Igneous Ridges 1 0 0 1
66b Northern Sedimentary and Metasedimentary Ridges 1 0 0 1
66c New River Plateau 1 0 0 1
66d Southern Crystalline Ridges and Mountains 1 0 0 1
66e Southern Sedimentary Ridges 1 0 0 1
66f Limestone Valleys and Coves 1 0 0 1
66g Southern Metasedimentary Mountains 1 0 0 1
66i High Mountains 1 0 0 1
66j Broad Basins 1 0 0 1
66k Amphibolite Mountains 1 0 0 1
66l Eastern Blue Ridge Foothills 1 0 0 1
66m Sauratown Mountains 1 0 0 1
67a Northern Limestone/Dolomite Valleys 1 0 0 1
67b Northern Shale Valleys 1 0 0 1
67c Northern Sandstone Ridges 1 0 0 1
67d Northern Dissected Ridges and Knobs 1 0 0 1
67e Anthracite Subregion 0 0 0 0
67f Southern Limestone/Dolomite Valleys and Low Rolling Hills 1 0 0 1
67g Southern Shale Valleys 1 0 0 1
67h Southern Sandstone Ridges 1 0 0 1
67i Southern Dissected Ridges and Knobs 1 0 0 1
67j Northern Glaciated Limestone Valleys 0 0 0 0
67k Northern Glaciated Shale and Slate Valleys 0 0 0 0
67l Northern Glaciated Limestone Ridges, Valleys, and Terraces 0 0 0 0
67m Northern Glaciated Ridges 0 0 0 0
68a Cumberland Plateau 1 0 0 1
68b Sequatchie Valley 1 0 0 0
68c Plateau Escarpment 1 0 0 1
68d Southern Table Plateaus 1 0 0 0
68e Dissected Plateau 0 0 0 0
68f Shale Hills 0 0 0 0
69a Forested Hills and Mountains 1 0 0 1
69b Uplands and Valleys of Mixed Land Use 0 0 0 0
69c Greenbriar Karst 1 0 0 1
69d Dissected Appalachian Plateau 1 0 0 1
69e Cumberland Mountain Thrust Block 1 0 0 1
70a Permian Hills 0 0 0 0
70b Monongahela Transition Zone 1 0 0 0
70c Pittsburgh Low Plateau 0 0 0 0
70d Knobs-Lower Scioto Dissected Plateau 0 0 0 0
70e Unglaciated Upper Muskingum Basin 0 0 0 0
70f Ohio/Kentucky Carboniferous Plateau 1 0 0 0
70g Northern Forested Plateau Escarpment 0 0 0 0
70h Carter Hills 0 0 0 0
71a Crawford-Mammoth Cave Uplands 0 0 0 0
71b Mitchell Plain 0 0 0 0
71c Knobs-Norman Upland 0 0 0 0
71d Outer Bluegrass 0 0 0 0
71e Western Pennyroyal Karst Plain 0 0 0 0
71f Western Highland Rim 0 0 0 0
71g Eastern Highland Rim 1 0 0 1
71h Outer Nashville Basin 1 0 0 0
71i Inner Nashville Basin 0 0 0 0
71j Little Mountain 0 0 0 0
71k Hills of the Bluegrass 0 0 0 0
71l Inner Bluegrass 0 0 0 0
72a Wabash-Ohio Bottomlands 0 0 0 0
72b Glaciated Wabash Lowlands 0 0 0 0
72c Green River-Southern Wabash Lowlands 0 0 0 0
72h Caseyville Hills 0 0 0 0
72j Southern Illinoian Till Plain 0 0 0 0
72m Wabash River Bluffs and Low Hills 0 0 0 0
75j Sea Islands/Coastal Marsh 0 0 0 0
83a Erie/Ontario Lake Plain 0 0 0 0
84a Cape Cod/Long Island 0 0 0 0
84b Pine Barrens 0 1 0 0
84c Barrier Islands/Coastal Marshes 0 1 0 0
84d Inner Coastal Plain 0 1 0 0

Phylogenetic Diversity Calculations

Match tree and dataframe

matched <- match.phylo.comm(tree, rasterstack_df.info3)

Make object for tree including matching taxa only (non-matching taxa will be pruned out)

matchtree <- matched$phy

Make object for dataset including matching taxa only (non-matching taxa removed)

matchcomm <- matched$comm 

Calculate phylogenetic distance matrix for use in MPD and MNTD calculations

MPD stands for mean pairwise distance, or mean phylogenetic distance among all pairs of species within a community. MNTD stands for mean nearest taxon distance, or the mean distance between each species within a community and its closest relative.

phydist <- cophenetic(matched$phy)

Calculate indices using picante

Null model options include taxa.labels (shuffle tips of phylogeny) among others. Use ses.pd, ses.mpd and ses.mntd for more info on alternative null models and other parameters and output format. Number of runs - only using 99 runs (comparing to null model) due to time constraints.

PD calculates Faith’s PD and standard effect size of PD

PD stands for phylogenetic diversity

pd_result <-ses.pd(matchcomm, matchtree, null.model="taxa.labels", runs=99)

Calculates mpd and ses mpd - equivalent to -NRI

NRI stands for net-relatedness index, standardizes MPD or mean pairwise distance. ses stands for standardized effect size. If we had abundance data, we could include that for mpd and mntd. The default is abundance.weighted = FALSE.

mpd_result <- ses.mpd(matchcomm, phydist, null.model="taxa.labels", abundance.weighted = FALSE, runs=99)

Calculates mntd and ses mntd - equivalent to -NTI

NTI stands for nearest taxon index, standardizes MNTD mean nearest taxon distance. ses stands for standardized effect size.

mntd_result <- ses.mntd(matchcomm, phydist, null.model="taxa.labels", runs=99)

Look at results

Positive ses values (obs.z values) and p values > 0.95 indicate overdispersion (greater than expected).Negative ses values (obs.z values) and p values < 0.05 indicate clustering (less than expected). Values not significantly different from zero indicate taxa in community randomly distributed across tree.

kable(pd_result) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", font_size = 10)) %>%
  scroll_box(width = "100%", height = "200px")
ntaxa pd.obs pd.rand.mean pd.rand.sd pd.obs.rank pd.obs.z pd.obs.p runs
45a Southern Inner Piedmont 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
45b Southern Outer Piedmont 3 2.00 1.810606 0.2232954 73.0 0.8481765 0.730 99
45c Carolina Slate Belt 3 2.00 1.810606 0.2232954 73.0 0.8481765 0.730 99
45d Talladega Upland 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
45e Northern Inner Piedmont 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
45f Northern Outer Piedmont 2 1.25 1.338384 0.1648791 33.0 -0.5360524 0.330 99
45g Triassic Basins 3 2.00 1.810606 0.2232954 73.0 0.8481765 0.730 99
45i Kings Mountain 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
54a Illinois/Indiana Prairies 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
54b Chicago Lake Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
54c Kankakee Marsh 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
54d Sand Area 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
54f Valparaiso-Wheaton Morainal Complex 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
55a Clayey High Lime Till Plains 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
55b Loamy High Lime Till Plains 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
55c Mad River Interlobate Area 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
55d Pre-Wisconsinan Drift Plains 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
55e Darby Plains 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
55f Whitewater Interlobate Area 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
56a Northern Indiana Lake Country 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
56b Battle Creek/Elkhart Outwash Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
56c Middle Tippecanoe Plains 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
56d Michigan Lake Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
57a Maumee Lake Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
57b Oak Openings 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
57c Paulding Plains 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
57d Marblehead Drift/Limestone Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
58b Western New England Marble Valleys 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
58e Berkshire Transition 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
58h Reading Prong 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
58i Glaciated Reading Prong/Hudson Highlands 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
58x Taconic Foothills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
59a Connecticut Valley 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
59c Southern New England Coastal Plains and Hills 2 1.25 1.325758 0.2003013 35.5 -0.3782182 0.355 99
59g Long Island Sound Coastal Lowland 2 1.25 1.325758 0.2003013 35.5 -0.3782182 0.355 99
59i Hudson Valley 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
60a Glaciated Low Allegheny Plateau 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
60b Delaware-Neversink Highlands 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
61b Mosquito Creek/Pymatuning Lowlands 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
61c Low Lime Drift Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
61d Erie Gorges 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
61e Summit Interlobate Area 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
62a Pocono High Plateau 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
62b Low Poconos 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
62c Glaciated High Allegheny Plateau 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
62d Unglaciated High Allegheny Plateau 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
63a Delaware River Terraces and Uplands 2 1.25 1.325758 0.2003013 35.5 -0.3782182 0.355 99
63b Chesapeake-Pamlico Lowlands and Tidal Marshes 3 2.00 1.803030 0.2028896 77.5 0.9708220 0.775 99
63c Swamps and Peatlands 2 1.50 1.348485 0.1849510 73.0 0.8192179 0.730 99
63d Virginian Barrier Islands and Coastal Marshes 2 1.50 1.348485 0.1849510 73.0 0.8192179 0.730 99
63e Mid-Atlantic Flatwoods 2 1.50 1.348485 0.1849510 73.0 0.8192179 0.730 99
63f Delmarva Uplands 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
63g Carolinian Barrier Islands and Coastal Marshes 3 1.75 1.820707 0.1989783 36.5 -0.3553507 0.365 99
63h Carolina Flatwoods 3 1.75 1.820707 0.1989783 36.5 -0.3553507 0.365 99
63n Mid-Atlantic Floodplains and Low Terraces 2 1.50 1.348485 0.1849510 73.0 0.8192179 0.730 99
64a Triassic Lowlands 2 1.25 1.325758 0.2003013 35.5 -0.3782182 0.355 99
64b Trap Rock and Conglomerate Uplands 2 1.25 1.325758 0.2003013 35.5 -0.3782182 0.355 99
64c Piedmont Uplands 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
64d Piedmont Limestone/Dolomite Lowlands 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
64e Glaciated Triassic Lowlands 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
64f Passaic Basin Freshwater Wetlands 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
64g Hackensack Meadowlands 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
65c Sand Hills 4 2.25 2.250000 0.0000000 50.5 NaN 0.505 99
65i Fall Line Hills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
65j Transition Hills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
65l Atlantic Southern Loam Plains 3 1.75 1.820707 0.1989783 36.5 -0.3553507 0.365 99
65m Rolling Coastal Plain 4 2.25 2.250000 0.0000000 50.5 NaN 0.505 99
65n Chesapeake Rolling Coastal Plain 2 1.25 1.325758 0.2003013 35.5 -0.3782182 0.355 99
65p Southeastern Floodplains and Low Terraces 2 1.50 1.328283 0.1877854 76.0 0.9144331 0.760 99
66a Northern Igneous Ridges 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66b Northern Sedimentary and Metasedimentary Ridges 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66c New River Plateau 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66d Southern Crystalline Ridges and Mountains 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66e Southern Sedimentary Ridges 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66f Limestone Valleys and Coves 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66g Southern Metasedimentary Mountains 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66i High Mountains 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66j Broad Basins 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66k Amphibolite Mountains 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66l Eastern Blue Ridge Foothills 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
66m Sauratown Mountains 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67a Northern Limestone/Dolomite Valleys 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67b Northern Shale Valleys 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67c Northern Sandstone Ridges 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67d Northern Dissected Ridges and Knobs 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67e Anthracite Subregion 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
67f Southern Limestone/Dolomite Valleys and Low Rolling Hills 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67g Southern Shale Valleys 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67h Southern Sandstone Ridges 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67i Southern Dissected Ridges and Knobs 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
67j Northern Glaciated Limestone Valleys 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
67k Northern Glaciated Shale and Slate Valleys 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
67l Northern Glaciated Limestone Ridges, Valleys, and Terraces 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
67m Northern Glaciated Ridges 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
68a Cumberland Plateau 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
68b Sequatchie Valley 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
68c Plateau Escarpment 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
68d Southern Table Plateaus 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
68e Dissected Plateau 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
68f Shale Hills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
69a Forested Hills and Mountains 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
69b Uplands and Valleys of Mixed Land Use 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
69c Greenbriar Karst 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
69d Dissected Appalachian Plateau 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
69e Cumberland Mountain Thrust Block 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
70a Permian Hills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
70b Monongahela Transition Zone 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
70c Pittsburgh Low Plateau 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
70d Knobs-Lower Scioto Dissected Plateau 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
70e Unglaciated Upper Muskingum Basin 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
70f Ohio/Kentucky Carboniferous Plateau 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
70g Northern Forested Plateau Escarpment 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
70h Carter Hills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71a Crawford-Mammoth Cave Uplands 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71b Mitchell Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71c Knobs-Norman Upland 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71d Outer Bluegrass 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71e Western Pennyroyal Karst Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71f Western Highland Rim 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71g Eastern Highland Rim 2 1.50 1.318182 0.1883677 77.5 0.9652304 0.775 99
71h Outer Nashville Basin 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
71i Inner Nashville Basin 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71j Little Mountain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71k Hills of the Bluegrass 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
71l Inner Bluegrass 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
72a Wabash-Ohio Bottomlands 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
72b Glaciated Wabash Lowlands 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
72c Green River-Southern Wabash Lowlands 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
72h Caseyville Hills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
72j Southern Illinoian Till Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
72m Wabash River Bluffs and Low Hills 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
75j Sea Islands/Coastal Marsh 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
83a Erie/Ontario Lake Plain 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
84a Cape Cod/Long Island 0 0.00 0.000000 0.0000000 50.5 NaN 0.505 99
84b Pine Barrens 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
84c Barrier Islands/Coastal Marshes 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
84d Inner Coastal Plain 1 0.75 0.750000 0.0000000 50.5 NaN 0.505 99
kable(mpd_result) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", font_size = 10)) %>%
  scroll_box(width = "100%", height = "200px")
ntaxa mpd.obs mpd.rand.mean mpd.rand.sd mpd.obs.rank mpd.obs.z mpd.obs.p runs
45a Southern Inner Piedmont 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
45b Southern Outer Piedmont 3 1.333333 1.193603 0.1900398 73.0 0.7352705 0.730 99
45c Carolina Slate Belt 3 1.333333 1.193603 0.1900398 73.0 0.7352705 0.730 99
45d Talladega Upland 1 NA NaN NA NA NA NA 99
45e Northern Inner Piedmont 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
45f Northern Outer Piedmont 2 1.000000 1.202020 0.3705979 30.5 -0.5451196 0.305 99
45g Triassic Basins 3 1.333333 1.193603 0.1900398 73.0 0.7352705 0.730 99
45i Kings Mountain 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
54a Illinois/Indiana Prairies 0 NA NaN NA NA NA NA 99
54b Chicago Lake Plain 0 NA NaN NA NA NA NA 99
54c Kankakee Marsh 0 NA NaN NA NA NA NA 99
54d Sand Area 0 NA NaN NA NA NA NA 99
54f Valparaiso-Wheaton Morainal Complex 0 NA NaN NA NA NA NA 99
55a Clayey High Lime Till Plains 0 NA NaN NA NA NA NA 99
55b Loamy High Lime Till Plains 0 NA NaN NA NA NA NA 99
55c Mad River Interlobate Area 0 NA NaN NA NA NA NA 99
55d Pre-Wisconsinan Drift Plains 0 NA NaN NA NA NA NA 99
55e Darby Plains 0 NA NaN NA NA NA NA 99
55f Whitewater Interlobate Area 0 NA NaN NA NA NA NA 99
56a Northern Indiana Lake Country 0 NA NaN NA NA NA NA 99
56b Battle Creek/Elkhart Outwash Plain 0 NA NaN NA NA NA NA 99
56c Middle Tippecanoe Plains 0 NA NaN NA NA NA NA 99
56d Michigan Lake Plain 0 NA NaN NA NA NA NA 99
57a Maumee Lake Plain 0 NA NaN NA NA NA NA 99
57b Oak Openings 0 NA NaN NA NA NA NA 99
57c Paulding Plains 0 NA NaN NA NA NA NA 99
57d Marblehead Drift/Limestone Plain 0 NA NaN NA NA NA NA 99
58b Western New England Marble Valleys 0 NA NaN NA NA NA NA 99
58e Berkshire Transition 0 NA NaN NA NA NA NA 99
58h Reading Prong 0 NA NaN NA NA NA NA 99
58i Glaciated Reading Prong/Hudson Highlands 0 NA NaN NA NA NA NA 99
58x Taconic Foothills 0 NA NaN NA NA NA NA 99
59a Connecticut Valley 1 NA NaN NA NA NA NA 99
59c Southern New England Coastal Plains and Hills 2 1.000000 1.131313 0.3818982 37.5 -0.3438433 0.375 99
59g Long Island Sound Coastal Lowland 2 1.000000 1.131313 0.3818982 37.5 -0.3438433 0.375 99
59i Hudson Valley 0 NA NaN NA NA NA NA 99
60a Glaciated Low Allegheny Plateau 0 NA NaN NA NA NA NA 99
60b Delaware-Neversink Highlands 0 NA NaN NA NA NA NA 99
61b Mosquito Creek/Pymatuning Lowlands 0 NA NaN NA NA NA NA 99
61c Low Lime Drift Plain 0 NA NaN NA NA NA NA 99
61d Erie Gorges 0 NA NaN NA NA NA NA 99
61e Summit Interlobate Area 0 NA NaN NA NA NA NA 99
62a Pocono High Plateau 0 NA NaN NA NA NA NA 99
62b Low Poconos 0 NA NaN NA NA NA NA 99
62c Glaciated High Allegheny Plateau 0 NA NaN NA NA NA NA 99
62d Unglaciated High Allegheny Plateau 0 NA NaN NA NA NA NA 99
63a Delaware River Terraces and Uplands 2 1.000000 1.131313 0.3818982 37.5 -0.3438433 0.375 99
63b Chesapeake-Pamlico Lowlands and Tidal Marshes 3 1.333333 1.144781 0.2124878 77.5 0.8873554 0.775 99
63c Swamps and Peatlands 2 1.500000 1.131313 0.3751589 78.0 0.9827486 0.780 99
63d Virginian Barrier Islands and Coastal Marshes 2 1.500000 1.131313 0.3751589 78.0 0.9827486 0.780 99
63e Mid-Atlantic Flatwoods 2 1.500000 1.131313 0.3751589 78.0 0.9827486 0.780 99
63f Delmarva Uplands 1 NA NaN NA NA NA NA 99
63g Carolinian Barrier Islands and Coastal Marshes 3 1.166667 1.164983 0.2095981 38.0 0.0080320 0.380 99
63h Carolina Flatwoods 3 1.166667 1.164983 0.2095981 38.0 0.0080320 0.380 99
63n Mid-Atlantic Floodplains and Low Terraces 2 1.500000 1.131313 0.3751589 78.0 0.9827486 0.780 99
64a Triassic Lowlands 2 1.000000 1.131313 0.3818982 37.5 -0.3438433 0.375 99
64b Trap Rock and Conglomerate Uplands 2 1.000000 1.131313 0.3818982 37.5 -0.3438433 0.375 99
64c Piedmont Uplands 1 NA NaN NA NA NA NA 99
64d Piedmont Limestone/Dolomite Lowlands 1 NA NaN NA NA NA NA 99
64e Glaciated Triassic Lowlands 1 NA NaN NA NA NA NA 99
64f Passaic Basin Freshwater Wetlands 1 NA NaN NA NA NA NA 99
64g Hackensack Meadowlands 1 NA NaN NA NA NA NA 99
65c Sand Hills 4 1.166667 1.166667 0.0000000 50.5 NaN 0.505 99
65i Fall Line Hills 0 NA NaN NA NA NA NA 99
65j Transition Hills 0 NA NaN NA NA NA NA 99
65l Atlantic Southern Loam Plains 3 1.166667 1.164983 0.2095981 38.0 0.0080320 0.380 99
65m Rolling Coastal Plain 4 1.166667 1.166667 0.0000000 50.5 NaN 0.505 99
65n Chesapeake Rolling Coastal Plain 2 1.000000 1.131313 0.3818982 37.5 -0.3438433 0.375 99
65p Southeastern Floodplains and Low Terraces 2 1.500000 1.207071 0.3572150 73.0 0.8200364 0.730 99
66a Northern Igneous Ridges 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66b Northern Sedimentary and Metasedimentary Ridges 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66c New River Plateau 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66d Southern Crystalline Ridges and Mountains 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66e Southern Sedimentary Ridges 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66f Limestone Valleys and Coves 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66g Southern Metasedimentary Mountains 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66i High Mountains 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66j Broad Basins 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66k Amphibolite Mountains 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66l Eastern Blue Ridge Foothills 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
66m Sauratown Mountains 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67a Northern Limestone/Dolomite Valleys 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67b Northern Shale Valleys 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67c Northern Sandstone Ridges 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67d Northern Dissected Ridges and Knobs 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67e Anthracite Subregion 0 NA NaN NA NA NA NA 99
67f Southern Limestone/Dolomite Valleys and Low Rolling Hills 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67g Southern Shale Valleys 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67h Southern Sandstone Ridges 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67i Southern Dissected Ridges and Knobs 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
67j Northern Glaciated Limestone Valleys 0 NA NaN NA NA NA NA 99
67k Northern Glaciated Shale and Slate Valleys 0 NA NaN NA NA NA NA 99
67l Northern Glaciated Limestone Ridges, Valleys, and Terraces 0 NA NaN NA NA NA NA 99
67m Northern Glaciated Ridges 0 NA NaN NA NA NA NA 99
68a Cumberland Plateau 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
68b Sequatchie Valley 1 NA NaN NA NA NA NA 99
68c Plateau Escarpment 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
68d Southern Table Plateaus 1 NA NaN NA NA NA NA 99
68e Dissected Plateau 0 NA NaN NA NA NA NA 99
68f Shale Hills 0 NA NaN NA NA NA NA 99
69a Forested Hills and Mountains 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
69b Uplands and Valleys of Mixed Land Use 0 NA NaN NA NA NA NA 99
69c Greenbriar Karst 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
69d Dissected Appalachian Plateau 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
69e Cumberland Mountain Thrust Block 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
70a Permian Hills 0 NA NaN NA NA NA NA 99
70b Monongahela Transition Zone 1 NA NaN NA NA NA NA 99
70c Pittsburgh Low Plateau 0 NA NaN NA NA NA NA 99
70d Knobs-Lower Scioto Dissected Plateau 0 NA NaN NA NA NA NA 99
70e Unglaciated Upper Muskingum Basin 0 NA NaN NA NA NA NA 99
70f Ohio/Kentucky Carboniferous Plateau 1 NA NaN NA NA NA NA 99
70g Northern Forested Plateau Escarpment 0 NA NaN NA NA NA NA 99
70h Carter Hills 0 NA NaN NA NA NA NA 99
71a Crawford-Mammoth Cave Uplands 0 NA NaN NA NA NA NA 99
71b Mitchell Plain 0 NA NaN NA NA NA NA 99
71c Knobs-Norman Upland 0 NA NaN NA NA NA NA 99
71d Outer Bluegrass 0 NA NaN NA NA NA NA 99
71e Western Pennyroyal Karst Plain 0 NA NaN NA NA NA NA 99
71f Western Highland Rim 0 NA NaN NA NA NA NA 99
71g Eastern Highland Rim 2 1.500000 1.171717 0.3790537 74.5 0.8660589 0.745 99
71h Outer Nashville Basin 1 NA NaN NA NA NA NA 99
71i Inner Nashville Basin 0 NA NaN NA NA NA NA 99
71j Little Mountain 0 NA NaN NA NA NA NA 99
71k Hills of the Bluegrass 0 NA NaN NA NA NA NA 99
71l Inner Bluegrass 0 NA NaN NA NA NA NA 99
72a Wabash-Ohio Bottomlands 0 NA NaN NA NA NA NA 99
72b Glaciated Wabash Lowlands 0 NA NaN NA NA NA NA 99
72c Green River-Southern Wabash Lowlands 0 NA NaN NA NA NA NA 99
72h Caseyville Hills 0 NA NaN NA NA NA NA 99
72j Southern Illinoian Till Plain 0 NA NaN NA NA NA NA 99
72m Wabash River Bluffs and Low Hills 0 NA NaN NA NA NA NA 99
75j Sea Islands/Coastal Marsh 0 NA NaN NA NA NA NA 99
83a Erie/Ontario Lake Plain 0 NA NaN NA NA NA NA 99
84a Cape Cod/Long Island 0 NA NaN NA NA NA NA 99
84b Pine Barrens 1 NA NaN NA NA NA NA 99
84c Barrier Islands/Coastal Marshes 1 NA NaN NA NA NA NA 99
84d Inner Coastal Plain 1 NA NaN NA NA NA NA 99
kable(mntd_result) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", font_size = 10)) %>%
  scroll_box(width = "100%", height = "200px")
ntaxa mntd.obs mntd.rand.mean mntd.rand.sd mntd.obs.rank mntd.obs.z mntd.obs.p runs
45a Southern Inner Piedmont 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
45b Southern Outer Piedmont 3 1.1666667 0.9747475 0.2107696 74.0 0.9105639 0.740 99
45c Carolina Slate Belt 3 1.1666667 0.9747475 0.2107696 74.0 0.9105639 0.740 99
45d Talladega Upland 1 NA NaN NA NA NA NA 99
45e Northern Inner Piedmont 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
45f Northern Outer Piedmont 2 1.0000000 1.2020202 0.3971788 30.5 -0.5086380 0.305 99
45g Triassic Basins 3 1.1666667 0.9747475 0.2107696 74.0 0.9105639 0.740 99
45i Kings Mountain 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
54a Illinois/Indiana Prairies 0 NA NaN NA NA NA NA 99
54b Chicago Lake Plain 0 NA NaN NA NA NA NA 99
54c Kankakee Marsh 0 NA NaN NA NA NA NA 99
54d Sand Area 0 NA NaN NA NA NA NA 99
54f Valparaiso-Wheaton Morainal Complex 0 NA NaN NA NA NA NA 99
55a Clayey High Lime Till Plains 0 NA NaN NA NA NA NA 99
55b Loamy High Lime Till Plains 0 NA NaN NA NA NA NA 99
55c Mad River Interlobate Area 0 NA NaN NA NA NA NA 99
55d Pre-Wisconsinan Drift Plains 0 NA NaN NA NA NA NA 99
55e Darby Plains 0 NA NaN NA NA NA NA 99
55f Whitewater Interlobate Area 0 NA NaN NA NA NA NA 99
56a Northern Indiana Lake Country 0 NA NaN NA NA NA NA 99
56b Battle Creek/Elkhart Outwash Plain 0 NA NaN NA NA NA NA 99
56c Middle Tippecanoe Plains 0 NA NaN NA NA NA NA 99
56d Michigan Lake Plain 0 NA NaN NA NA NA NA 99
57a Maumee Lake Plain 0 NA NaN NA NA NA NA 99
57b Oak Openings 0 NA NaN NA NA NA NA 99
57c Paulding Plains 0 NA NaN NA NA NA NA 99
57d Marblehead Drift/Limestone Plain 0 NA NaN NA NA NA NA 99
58b Western New England Marble Valleys 0 NA NaN NA NA NA NA 99
58e Berkshire Transition 0 NA NaN NA NA NA NA 99
58h Reading Prong 0 NA NaN NA NA NA NA 99
58i Glaciated Reading Prong/Hudson Highlands 0 NA NaN NA NA NA NA 99
58x Taconic Foothills 0 NA NaN NA NA NA NA 99
59a Connecticut Valley 1 NA NaN NA NA NA NA 99
59c Southern New England Coastal Plains and Hills 2 1.0000000 1.1464646 0.3863257 36.0 -0.3791222 0.360 99
59g Long Island Sound Coastal Lowland 2 1.0000000 1.1464646 0.3863257 36.0 -0.3791222 0.360 99
59i Hudson Valley 0 NA NaN NA NA NA NA 99
60a Glaciated Low Allegheny Plateau 0 NA NaN NA NA NA NA 99
60b Delaware-Neversink Highlands 0 NA NaN NA NA NA NA 99
61b Mosquito Creek/Pymatuning Lowlands 0 NA NaN NA NA NA NA 99
61c Low Lime Drift Plain 0 NA NaN NA NA NA NA 99
61d Erie Gorges 0 NA NaN NA NA NA NA 99
61e Summit Interlobate Area 0 NA NaN NA NA NA NA 99
62a Pocono High Plateau 0 NA NaN NA NA NA NA 99
62b Low Poconos 0 NA NaN NA NA NA NA 99
62c Glaciated High Allegheny Plateau 0 NA NaN NA NA NA NA 99
62d Unglaciated High Allegheny Plateau 0 NA NaN NA NA NA NA 99
63a Delaware River Terraces and Uplands 2 1.0000000 1.1464646 0.3863257 36.0 -0.3791222 0.360 99
63b Chesapeake-Pamlico Lowlands and Tidal Marshes 3 1.1666667 0.9461279 0.2269140 75.5 0.9719045 0.755 99
63c Swamps and Peatlands 2 1.5000000 1.1111111 0.3680863 80.0 1.0565155 0.800 99
63d Virginian Barrier Islands and Coastal Marshes 2 1.5000000 1.1111111 0.3680863 80.0 1.0565155 0.800 99
63e Mid-Atlantic Flatwoods 2 1.5000000 1.1111111 0.3680863 80.0 1.0565155 0.800 99
63f Delmarva Uplands 1 NA NaN NA NA NA NA 99
63g Carolinian Barrier Islands and Coastal Marshes 3 0.8333333 0.9579125 0.2237885 39.0 -0.5566823 0.390 99
63h Carolina Flatwoods 3 0.8333333 0.9579125 0.2237885 39.0 -0.5566823 0.390 99
63n Mid-Atlantic Floodplains and Low Terraces 2 1.5000000 1.1111111 0.3680863 80.0 1.0565155 0.800 99
64a Triassic Lowlands 2 1.0000000 1.1464646 0.3863257 36.0 -0.3791222 0.360 99
64b Trap Rock and Conglomerate Uplands 2 1.0000000 1.1464646 0.3863257 36.0 -0.3791222 0.360 99
64c Piedmont Uplands 1 NA NaN NA NA NA NA 99
64d Piedmont Limestone/Dolomite Lowlands 1 NA NaN NA NA NA NA 99
64e Glaciated Triassic Lowlands 1 NA NaN NA NA NA NA 99
64f Passaic Basin Freshwater Wetlands 1 NA NaN NA NA NA NA 99
64g Hackensack Meadowlands 1 NA NaN NA NA NA NA 99
65c Sand Hills 4 0.8750000 0.8750000 0.0000000 50.5 NaN 0.505 99
65i Fall Line Hills 0 NA NaN NA NA NA NA 99
65j Transition Hills 0 NA NaN NA NA NA NA 99
65l Atlantic Southern Loam Plains 3 0.8333333 0.9579125 0.2237885 39.0 -0.5566823 0.390 99
65m Rolling Coastal Plain 4 0.8750000 0.8750000 0.0000000 50.5 NaN 0.505 99
65n Chesapeake Rolling Coastal Plain 2 1.0000000 1.1464646 0.3863257 36.0 -0.3791222 0.360 99
65p Southeastern Floodplains and Low Terraces 2 1.5000000 1.1818182 0.3673856 74.5 0.8660705 0.745 99
66a Northern Igneous Ridges 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66b Northern Sedimentary and Metasedimentary Ridges 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66c New River Plateau 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66d Southern Crystalline Ridges and Mountains 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66e Southern Sedimentary Ridges 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66f Limestone Valleys and Coves 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66g Southern Metasedimentary Mountains 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66i High Mountains 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66j Broad Basins 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66k Amphibolite Mountains 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66l Eastern Blue Ridge Foothills 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
66m Sauratown Mountains 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67a Northern Limestone/Dolomite Valleys 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67b Northern Shale Valleys 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67c Northern Sandstone Ridges 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67d Northern Dissected Ridges and Knobs 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67e Anthracite Subregion 0 NA NaN NA NA NA NA 99
67f Southern Limestone/Dolomite Valleys and Low Rolling Hills 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67g Southern Shale Valleys 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67h Southern Sandstone Ridges 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67i Southern Dissected Ridges and Knobs 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
67j Northern Glaciated Limestone Valleys 0 NA NaN NA NA NA NA 99
67k Northern Glaciated Shale and Slate Valleys 0 NA NaN NA NA NA NA 99
67l Northern Glaciated Limestone Ridges, Valleys, and Terraces 0 NA NaN NA NA NA NA 99
67m Northern Glaciated Ridges 0 NA NaN NA NA NA NA 99
68a Cumberland Plateau 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
68b Sequatchie Valley 1 NA NaN NA NA NA NA 99
68c Plateau Escarpment 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
68d Southern Table Plateaus 1 NA NaN NA NA NA NA 99
68e Dissected Plateau 0 NA NaN NA NA NA NA 99
68f Shale Hills 0 NA NaN NA NA NA NA 99
69a Forested Hills and Mountains 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
69b Uplands and Valleys of Mixed Land Use 0 NA NaN NA NA NA NA 99
69c Greenbriar Karst 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
69d Dissected Appalachian Plateau 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
69e Cumberland Mountain Thrust Block 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
70a Permian Hills 0 NA NaN NA NA NA NA 99
70b Monongahela Transition Zone 1 NA NaN NA NA NA NA 99
70c Pittsburgh Low Plateau 0 NA NaN NA NA NA NA 99
70d Knobs-Lower Scioto Dissected Plateau 0 NA NaN NA NA NA NA 99
70e Unglaciated Upper Muskingum Basin 0 NA NaN NA NA NA NA 99
70f Ohio/Kentucky Carboniferous Plateau 1 NA NaN NA NA NA NA 99
70g Northern Forested Plateau Escarpment 0 NA NaN NA NA NA NA 99
70h Carter Hills 0 NA NaN NA NA NA NA 99
71a Crawford-Mammoth Cave Uplands 0 NA NaN NA NA NA NA 99
71b Mitchell Plain 0 NA NaN NA NA NA NA 99
71c Knobs-Norman Upland 0 NA NaN NA NA NA NA 99
71d Outer Bluegrass 0 NA NaN NA NA NA NA 99
71e Western Pennyroyal Karst Plain 0 NA NaN NA NA NA NA 99
71f Western Highland Rim 0 NA NaN NA NA NA NA 99
71g Eastern Highland Rim 2 1.5000000 1.1767677 0.3522025 76.0 0.9177457 0.760 99
71h Outer Nashville Basin 1 NA NaN NA NA NA NA 99
71i Inner Nashville Basin 0 NA NaN NA NA NA NA 99
71j Little Mountain 0 NA NaN NA NA NA NA 99
71k Hills of the Bluegrass 0 NA NaN NA NA NA NA 99
71l Inner Bluegrass 0 NA NaN NA NA NA NA 99
72a Wabash-Ohio Bottomlands 0 NA NaN NA NA NA NA 99
72b Glaciated Wabash Lowlands 0 NA NaN NA NA NA NA 99
72c Green River-Southern Wabash Lowlands 0 NA NaN NA NA NA NA 99
72h Caseyville Hills 0 NA NaN NA NA NA NA 99
72j Southern Illinoian Till Plain 0 NA NaN NA NA NA NA 99
72m Wabash River Bluffs and Low Hills 0 NA NaN NA NA NA NA 99
75j Sea Islands/Coastal Marsh 0 NA NaN NA NA NA NA 99
83a Erie/Ontario Lake Plain 0 NA NaN NA NA NA NA 99
84a Cape Cod/Long Island 0 NA NaN NA NA NA NA 99
84b Pine Barrens 1 NA NaN NA NA NA NA 99
84c Barrier Islands/Coastal Marshes 1 NA NaN NA NA NA NA 99
84d Inner Coastal Plain 1 NA NaN NA NA NA NA 99

Plot PD results by community

ggplot() +
  geom_point(data = pd_result, aes(x = rownames(pd_result), y = pd.obs, col = pd.obs)) +
  ggtitle("PD values by ecoregion") +
  xlab("Community") + 
  ylab("PD") +
  theme(text = element_text(size=10), axis.text.x = element_text(angle = 90, hjust = 1))

Plot MPD results by community

ggplot()+
  geom_point(data = mpd_result, aes(x = rownames(mpd_result), y = mpd.obs, col = mpd.obs))+
  ggtitle("MPD values by ecoregion")+
  xlab("Community")+
  ylab("MPD")+
  theme(text = element_text(size=10), axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 86 rows containing missing values (geom_point).

Plot MNTD results by community

ggplot()+
  geom_point(data = mntd_result, aes(x = rownames(mntd_result), y = mntd.obs, col = mntd.obs))+
  ggtitle("MNTD values by ecoregion")+
  xlab("Community")+
  ylab("MNTD")+
  theme(text = element_text(size=10), axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 86 rows containing missing values (geom_point).